How to ensure Presto scalability in multi use case
Kai Sasaki, Treasure Data Inc.

Apr 15, 2017

Kai Sasaki
Transcript
Page 1: How to ensure Presto scalability in multi use case

How to ensure Presto scalability in multi use case

Kai Sasaki

Treasure Data Inc.

Page 2: How to ensure Presto scalability in multi use case

Kai Sasaki (@Lewuathe)

Software Engineer at Treasure Data Inc.
Hadoop/Presto/Spark

Page 3: How to ensure Presto scalability in multi use case

Presto In TD
• 150000+ queries / day
• 190+ TB processing / day
• 10+ MB processing / query * sec
• 100+ million processed records / query

Page 4: How to ensure Presto scalability in multi use case

Presto In TD

[Architecture diagram. Components: BI Tool, Prestobase Proxy, TD API, PerfectQueue, Presto, Plazma. Arrow labels: HTTP, query, data.]

Page 5: How to ensure Presto scalability in multi use case

How to make it scalable
• Prestobase Proxy
• Node scheduler
• Resource Group

Page 6: How to ensure Presto scalability in multi use case

Prestobase proxy

Page 7: How to ensure Presto scalability in multi use case

Prestobase proxy

Prestobase proxy provides an interface to Presto, especially for BI tools connecting through JDBC/ODBC, and is intended to replace Prestogres.

Page 9: How to ensure Presto scalability in multi use case

Prestobase proxy

• Written in Scala
• Finagle-based RPC proxy
• Runs as a Docker container
• Uses Airframe
• VCR-based lightweight test framework

Page 10: How to ensure Presto scalability in multi use case

Finagle

Finagle is an extensible RPC system for the JVM, used to construct high-concurrency servers. Finagle implements uniform client and server APIs for several protocols, and is designed for high performance and concurrency.

see: https://twitter.github.io/finagle/

Page 11: How to ensure Presto scalability in multi use case

Finagle

protected val service: Service[Request, Response] =
  bind[SomeFilter] andThen
  bind[AnotherHandler] andThen
  LastFilter andThen
  prestoClient

Build the request pipeline by binding filters and handlers with Airframe.
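The chaining idea can be illustrated without Finagle or Airframe. The following is a minimal sketch in plain Scala, not Finagle's actual API: `Request`, `Response`, `Filter`, `TagSource`, `RejectEmpty`, and the stub `prestoClient` are all hypothetical stand-ins introduced for illustration.

```scala
object PipelineSketch {
  // Simplified stand-ins for the HTTP request/response types (hypothetical).
  final case class Request(sql: String, headers: Map[String, String] = Map.empty)
  final case class Response(status: Int, body: String)

  // A service turns a request into a response.
  type Service = Request => Response

  // A filter wraps a service: it can change the request, the response, or short-circuit.
  trait Filter { self =>
    def apply(req: Request, next: Service): Response

    // Chain another filter after this one.
    def andThen(other: Filter): Filter = new Filter {
      def apply(req: Request, next: Service): Response =
        self.apply(req, r => other.apply(r, next))
    }

    // Terminate the chain with a service.
    def andThen(service: Service): Service =
      req => self.apply(req, service)
  }

  // Hypothetical filter: tags each request with a source header.
  object TagSource extends Filter {
    def apply(req: Request, next: Service): Response =
      next(req.copy(headers = req.headers + ("X-Source" -> "prestobase")))
  }

  // Hypothetical filter: rejects empty queries before they reach the backend.
  object RejectEmpty extends Filter {
    def apply(req: Request, next: Service): Response =
      if (req.sql.trim.isEmpty) Response(400, "empty query") else next(req)
  }

  // Stub standing in for the Presto client at the end of the pipeline.
  val prestoClient: Service =
    req => Response(200, s"ran '${req.sql}' from ${req.headers.getOrElse("X-Source", "unknown")}")

  val service: Service = TagSource andThen RejectEmpty andThen prestoClient
}
```

Each filter sees the request before the next stage does, so a filter placed earlier in the chain can rewrite or reject a request without the later stages knowing about it.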

Page 12: How to ensure Presto scalability in multi use case

Airframe

Airframe is a trait-based dependency injection framework that uses Scala macros.

- https://github.com/wvlet/airframe

Page 13: How to ensure Presto scalability in multi use case

Airframe

- Dependency injection tailored to Scala
- Tagged binding with wvlet https://github.com/wvlet/wvlet
- Object lifecycle management

Page 14: How to ensure Presto scalability in multi use case

Airframe

import wvlet.airframe._

val design: Design =
  newDesign
    .bind[X].toInstance(new X)  // Bind type X to a concrete instance
    .bind[Y].toSingleton        // Bind type Y to a singleton object
    .bind[Z].to[ZImpl]          // Bind type Z to an instance of ZImpl

trait App {
  val x = bind[X]
  val y = bind[Y]
  val z = bind[Z]
  // Do something with X, Y, and Z
}

val session = design.newSession
val app: App = session.build[App]

Page 15: How to ensure Presto scalability in multi use case

VCR testing framework

Record the HTTP interactions of the test suite so that tests are stable and deterministic.

See more detail: https://testing.googleblog.com/2016/11/what-test-engineers-do-at-google.html

Page 16: How to ensure Presto scalability in multi use case

VCR testing framework

On CI

protected val service: Service[Request, Response] =
  bind[SomeFilter] andThen
  bind[AnotherHandler] andThen
  QueryRewriter andThen
  bind[RequestVCR] andThen
  prestoClient

On Production

protected val service: Service[Request, Response] =
  bind[SomeFilter] andThen
  bind[AnotherHandler] andThen
  QueryRewriter andThen
  bind[NoRecording] andThen
  prestoClient

Page 17: How to ensure Presto scalability in multi use case

VCR testing framework

[Diagram, recording mode. Components: Client, Prestobase, RequestVCR, SQLite.]

Page 18: How to ensure Presto scalability in multi use case

VCR testing framework

[Diagram, replaying mode. Components: Client, Prestobase, RequestVCR, SQLite.]
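The recording and replaying modes can be sketched as a wrapper around the backend call. This is a minimal illustration, not the Prestobase implementation: the `Cassette` class is a hypothetical in-memory store standing in for SQLite, and `vcr`, `Mode`, `Request`, and `Response` are names invented for this example.

```scala
import scala.collection.mutable

object VcrSketch {
  final case class Request(method: String, path: String, body: String)
  final case class Response(status: Int, body: String)
  type Service = Request => Response

  sealed trait Mode
  case object Recording extends Mode
  case object Replaying extends Mode

  // In-memory store of recorded interactions (hypothetical; the slides use SQLite).
  final class Cassette {
    private val tape = mutable.Map.empty[Request, Response]
    def save(req: Request, rep: Response): Unit = tape.update(req, rep)
    def find(req: Request): Option[Response] = tape.get(req)
  }

  // Wraps a backend service.
  // Recording: the call goes to the backend and the pair is stored.
  // Replaying: the stored response is returned and the backend is never called.
  def vcr(mode: Mode, cassette: Cassette, backend: Service): Service = req =>
    mode match {
      case Recording =>
        val rep = backend(req)
        cassette.save(req, rep)
        rep
      case Replaying =>
        cassette.find(req).getOrElse(
          sys.error(s"no recording for ${req.method} ${req.path}"))
    }
}
```

Because replay never touches the backend, a test run in this mode depends only on the recorded data, which is what makes it deterministic.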

Page 19: How to ensure Presto scalability in multi use case

Prestobase proxy

Will be open sourced soon

Page 20: How to ensure Presto scalability in multi use case

Node Scheduler

Page 21: How to ensure Presto scalability in multi use case

Node Scheduler

Submitting a query involves the following steps:
- Analyze the query AST
- Make the logical/physical query plan
- Schedule each stage

Page 22: How to ensure Presto scalability in multi use case

Node Scheduler

[Diagram: a query divided into stage2, stage1, and stage0, each running tasks (task2-0, task2-1, task2-0, task1-0, task1-1, task0-0). Labels: Table Scan, output.]

Page 23: How to ensure Presto scalability in multi use case

Node Scheduler

NodeScheduler creates a NodeSelector, which selects the worker nodes on which tasks are scheduled. The NodeSelector picks worker nodes when there are splits available.

Page 24: How to ensure Presto scalability in multi use case

Node Scheduler in TD

Keeps a map of worker nodes that are candidates for launching the next tasks.
- Ignore min candidates
- Limit by available memory pool
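One way to picture limiting candidates by available memory is a filter over the worker map. This is only a sketch under assumptions: the `Worker` fields, the threshold parameter, and the tie-breaking by assigned splits are hypothetical and are not taken from the TD scheduler. Not padding the result up to a minimum count is one possible reading of "ignore min candidates".

```scala
object NodeSelectionSketch {
  // Hypothetical view of a worker's state; field names are illustrative.
  final case class Worker(id: String, freeMemoryBytes: Long, assignedSplits: Int)

  // Keep only workers with enough free memory, prefer the least-loaded ones,
  // and return at most `limit` candidates. The result is not padded to a minimum size.
  def candidates(workers: Seq[Worker], minFreeMemoryBytes: Long, limit: Int): Seq[Worker] =
    workers
      .filter(_.freeMemoryBytes >= minFreeMemoryBytes)
      .sortBy(w => (w.assignedSplits, w.id))
      .take(limit)
}
```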

Page 25: How to ensure Presto scalability in multi use case

Node Scheduler in TD

Memory pool usage returns to normal after the task is completed.

Page 26: How to ensure Presto scalability in multi use case

Node Scheduler in TD

Challenges
- Smoothing CPU time metric
- Split type awareness
- Avoid problematic worker nodes

Page 27: How to ensure Presto scalability in multi use case

Resource Group

Page 28: How to ensure Presto scalability in multi use case

Resource Group

Resource Group was introduced in 0.147 → https://prestodb.io/docs/current/admin/resource-groups.html

Resource Group limits resource usage by account, group, or query.

Page 29: How to ensure Presto scalability in multi use case

Resource Group

[Diagram: rootGroup with child groups general and adhoc. Limit boxes shown:]
- softMemoryLimit: 100%, maxQueued: 5000, maxRunning: 1000
- softMemoryLimit: 100%, maxQueued: 100, maxRunning: 200
- softMemoryLimit: 100%, maxRunning: 1000

Page 30: How to ensure Presto scalability in multi use case

Resource Group limits

- maxQueued
- maxRunning
- softMemoryLimit: once exceeded, subsequent queries are queued
- softCpuLimit: once exceeded, a penalty is imposed on the maximum number of running queries
- hardCpuLimit: once exceeded, subsequent queries are queued
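How these limits interact for a single group can be sketched as an admission check. This is a simplified illustration, not Presto's implementation: the `Limits`, `Usage`, and `Decision` types are hypothetical, CPU limits are left out, and the sketch assumes a query is rejected once the queue itself is full.

```scala
object AdmissionSketch {
  sealed trait Decision
  case object Run extends Decision
  case object Queue extends Decision
  case object Reject extends Decision

  // Hypothetical per-group limits and current usage.
  final case class Limits(maxRunning: Int, maxQueued: Int, softMemoryLimitBytes: Long)
  final case class Usage(running: Int, queued: Int, memoryBytes: Long)

  // A new query runs if the group is under both the running and memory limits,
  // is queued if there is still room in the queue, and is rejected otherwise.
  def admit(limits: Limits, usage: Usage): Decision = {
    val canRun = usage.running < limits.maxRunning &&
      usage.memoryBytes < limits.softMemoryLimitBytes
    if (canRun) Run
    else if (usage.queued < limits.maxQueued) Queue
    else Reject
  }
}
```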

Page 31: How to ensure Presto scalability in multi use case

Resource Group scheduling

- schedulingPolicy
  - fair : FIFO
  - weighted : Selected stochastically
  - query_priority : Selected according to priority
- schedulingWeight
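Stochastic selection under the weighted policy can be pictured as a lottery in which each group holds tickets in proportion to its schedulingWeight. The following is a minimal sketch of that idea, not Presto's code: the `Group` type and `pick` function are hypothetical.

```scala
import scala.util.Random

object WeightedPickSketch {
  // Hypothetical sub-group with a scheduling weight.
  final case class Group(name: String, schedulingWeight: Int)

  // Picks one group with probability proportional to its weight.
  // Returns None when no group has a positive weight.
  def pick(groups: Seq[Group], rng: Random): Option[Group] = {
    val eligible = groups.filter(_.schedulingWeight > 0)
    val total = eligible.map(_.schedulingWeight).sum
    if (total == 0) None
    else {
      var ticket = rng.nextInt(total) // uniform in [0, total)
      // Walk the groups, subtracting each weight until the ticket falls inside one.
      eligible.find { g =>
        ticket -= g.schedulingWeight
        ticket < 0
      }
    }
  }
}
```

With weights 3 and 1, the first group should be chosen roughly three times out of four over many draws.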

Page 32: How to ensure Presto scalability in multi use case

Resource Group

Every query must be associated with a resource group. Matching is done by the configured selectors.

{ "user": "bob", "group": "general" },
{ "source": ".*adhoc.*", "group": "global.adhoc.adhoc_${USER}" }
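The selector rules above can be read as regular-expression matches applied in order, with the first match deciding the group. Below is a minimal sketch of that behavior in plain Scala. The `Selector`, `Query`, and `resolve` names are hypothetical and are not Presto classes.

```scala
import scala.util.matching.Regex

object SelectorSketch {
  // A selector with optional user/source patterns and a group name template.
  final case class Selector(user: Option[Regex], source: Option[Regex], group: String)
  final case class Query(user: String, source: Option[String])

  // A missing pattern places no constraint; a present pattern must match the whole value.
  private def matches(pattern: Option[Regex], value: Option[String]): Boolean =
    pattern match {
      case None    => true
      case Some(p) => value.exists(v => p.pattern.matcher(v).matches())
    }

  // The first matching selector wins; ${USER} in the template is replaced by the user name.
  def resolve(selectors: Seq[Selector], q: Query): Option[String] =
    selectors
      .find(s => matches(s.user, Some(q.user)) && matches(s.source, q.source))
      .map(_.group.replace("${USER}", q.user))
}
```

For example, a query from user alice with a source containing "adhoc" would resolve to global.adhoc.adhoc_alice under the two rules shown.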

Page 33: How to ensure Presto scalability in multi use case

Resource Group

[Diagram: rootGroup with child groups general and adhoc and their limits (softMemoryLimit, maxQueued, maxRunning), with Bob's queries shown in the hierarchy.]

Page 34: How to ensure Presto scalability in multi use case

Resource Group DI

Resource group configuration behavior can be changed easily with Guice injection.

- ResourceGroupConfigurationManager
  - configure(ResourceGroup, SelectionContext)
- ResourceGroupSelector
  - match(Statement, SelectionContext)

Page 35: How to ensure Presto scalability in multi use case

SelectionContext

SelectionContext holds the information used to associate a submitted query with a resource group.

- Authenticated
- User
- Source
- Query Priority

These are currently available by default.

Page 36: How to ensure Presto scalability in multi use case

{
  "runningQueryIds": ["query1", "query2"],
  "accountId": 1,
  "children": [
    {
      "memoryUsage": 12345,
      "runningQueryIds": ["query1"],
      "children": [],
      "runningQueries": 1,
      "queuedQueries": 0,
      "maxRunningQueries": 2,
      "resourceId": "general"
    },
    {
      "memoryUsage": 26296,
      "runningQueryIds": ["query2"],
      "children": [],
      "runningQueries": 1,
      "queuedQueries": 0,
      "maxRunningQueries": 2,
      "resourceId": "scheduled"
    }
  ],
  "runningQueries": 2,
  "maxRunningQueries": 30
}

Annotations:
- Top-level runningQueryIds: queries in parent group
- Child "general": running query in general
- Child "scheduled": running query in scheduled

Page 37: How to ensure Presto scalability in multi use case

Recap

A distributed system often requires each component to be stable and scalable. We can make the Presto ecosystem reliable through:

- Reliable code modification with DI
- VCR testing
- Multi-dimensional resource scheduling
- Resource isolation, which makes a multi-tenant distributed SQL engine reliable