Impala from the Basics: A Massively Parallel SQL Engine
野 智聡 | Cloudera K.K.
© 2015 Cloudera, Inc. All rights reserved.
Customer Operations Engineer
Impala 2.0 Roadmap
Impala
Cloudera Impala: SQL engine for Hadoop
http://impala.io/
Impala: HDFS, HBase
Hive
ODBC / JDBC
Kerberos / LDAP
CDH
Cloudera / Oracle / MapR / Amazon
Impala and Hive
Hadoop
Impala, Hive
Impala -> SQL -> Hive -> (e.g., nested type)
SQL on Hadoop
Impala: JDBC/ODBC, BI tools (e.g., Tableau, Zoomdata, MicroStrategy, QlikView, SAS)
Hive (MapReduce/Spark): SQL, ETL
Spark SQL: SQL on Spark
CDH 5.4: Hive on Spark / Spark SQL
Impala architecture
[Figure: clients (impala-shell command-line client; SQL App over ODBC / JDBC) connect to impalad. Each impalad contains a Query Planner, Query Coordinator, and Query Exec Engine and runs alongside an HDFS DataNode. Supporting services: catalogd, StateStore, Hive Metastore, HDFS NN.]
Impala Daemon (impalad)
Runs on each HDFS DataNode.
[Figure: three impalad instances, each with a Query Planner, Query Coordinator, and Query Exec Engine, co-located with an HDFS DataNode.]
Catalog Service (catalogd)
Distributes HDFS and Hive metadata to impalad.
DDL run through impalad is reflected in the Hive Metastore.
[Figure: catalogd sits between the Hive Metastore / HDFS NN and impalad (Query Planner, Query Coordinator, Query Exec Engine, HDFS DataNode), next to the State Store. Labels: DDL; Block, Hive.]
StateStore
One StateStore; monitors impalad.
[Figure: State Store connected to catalogd and to impalad (Query Planner, Query Coordinator, Query Exec Engine, HDFS DataNode).]
Impala query execution
[Figure sequence, six slides: a SQL App connects over ODBC / JDBC to one of three nodes. Each node runs a Query Planner, Query Coordinator, and Query Exec Engine next to an HDFS DataNode and HBase. Step labels that survive, in slide order: (none); SQL; impalad; HDFS (JOIN); impalad; HiveQL.]
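The flow these slides illustrate (a client sends SQL to one impalad, the work is spread across the nodes, each node reads its local data, and the results are merged and returned) can be sketched as a toy scatter-gather in Python. This is not Impala's implementation: the node names, the sample rows, and the `plan`, `execute_fragment`, and `coordinate` functions are hypothetical stand-ins for the Query Planner, Query Exec Engine, and Query Coordinator, and the sketch assumes a simple `SUM ... GROUP BY` query.

```python
from collections import defaultdict

# Hypothetical cluster: each node holds the rows stored on its local DataNode.
LOCAL_BLOCKS = {
    "node1": [("tokyo", 3), ("osaka", 5)],
    "node2": [("tokyo", 7)],
    "node3": [("osaka", 1), ("nagoya", 4)],
}


def plan(nodes):
    """Planner stand-in: one scan + partial-aggregate fragment per node."""
    return [{"node": node} for node in nodes]


def execute_fragment(fragment):
    """Exec-engine stand-in: aggregate only the rows local to this node."""
    partial = defaultdict(int)
    for key, value in LOCAL_BLOCKS[fragment["node"]]:
        partial[key] += value
    return dict(partial)


def coordinate(nodes):
    """Coordinator stand-in: fan fragments out, then merge the partial sums."""
    final = defaultdict(int)
    for fragment in plan(nodes):
        for key, value in execute_fragment(fragment).items():
            final[key] += value
    return dict(sorted(final.items()))


result = coordinate(sorted(LOCAL_BLOCKS))
print(result)
```

The point of the sketch is only the shape of the computation: partial results are produced where the data lives, and a single coordinator merges them before replying to the client.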
Disk
MapReduce: Disk
Impala
UDF
UDF / UDAF
Impala C++ UDF
Java Hive UDF
Python UDF: https://github.com/cloudera/impyla
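Besides its UDF work, the impyla project linked above also ships a Python DB-API client for Impala. The sketch below shows only that client side, not Python UDFs. It assumes impyla is installed (`pip install impyla`) and that an impalad accepts HiveServer2 connections on port 21050; the host name, the table name `web_logs`, and the helper names `build_query` and `run_query` are hypothetical.

```python
def build_query(table, limit):
    """Build a simple SELECT; reject table names that are not plain identifiers."""
    if not table.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return f"SELECT * FROM {table} LIMIT {int(limit)}"


def run_query(host, sql, port=21050):
    """Run `sql` against an impalad and return all rows (needs a live cluster)."""
    # Imported lazily so the rest of the module works without impyla installed.
    from impala.dbapi import connect

    conn = connect(host=host, port=port)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


sql = build_query("web_logs", 10)
print(sql)
# Example call, assuming a reachable impalad (hypothetical host name):
# rows = run_query("impalad.example.com", sql)
```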
[Figure: Group A and Group B. Values shown: 100 / 10; 10 / 1; 1000 GB; 100 GB.]
Impala security
Authentication: Kerberos/LDAP
Authorization: Sentry (HDFS)
Audit: Cloudera Navigator
I/O
bzip2
Parquet and Impala
Columnar format reduces I/O; Impala compresses Parquet with snappy by default.
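As a rough illustration of why a columnar layout reduces I/O, the toy below compares the bytes that must be read to scan one column in a row-oriented layout versus a column-oriented one, and then compresses the single column. It is a sketch under stated assumptions, not Parquet itself: the table contents and the text serialization are made up, and `zlib` from the standard library stands in for snappy, which is not in the standard library.

```python
import zlib

# Hypothetical table: (id, name, amount) for 1,000 rows.
rows = [(i, f"user{i:04d}", i * 1.5) for i in range(1000)]


def encode(values):
    """Serialize values as newline-separated text (a stand-in for a file layout)."""
    return "\n".join(repr(v) for v in values).encode("utf-8")


# Row layout: scanning `amount` still means reading every full row.
row_layout = encode(rows)

# Columnar layout: only the `amount` values are read.
amount_column = encode(r[2] for r in rows)

# Values of one type stored together tend to compress well (zlib as a stand-in).
compressed_column = zlib.compress(amount_column)

row_bytes = len(row_layout)
column_bytes = len(amount_column)
compressed_bytes = len(compressed_column)
print(row_bytes, column_bytes, compressed_bytes)
```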
HBase and Impala
Impala <-> HBase <-> External systems
External systems: put; Impala: SELECT * FROM hbase_tbl
Impala: INSERT / INSERT VALUES; External systems: get, scan
put/get: Hadoop NoSQL
Impala: HBase, HDFS
HBase
Kudu
Parquet, HDFS, Kudu
CDH 5.4
2.0
Impala 2.0 (CDH 5.2)
Spilling to disk (Disk spill)
SQL 2003 window functions (RANK, LAG)
Subqueries in WHERE
Data types (VARCHAR, CHAR)
Functions (VAR_SAMP, VAR_POP)
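For readers unfamiliar with the functions named above, the following Python sketch mimics what `RANK`, `LAG`, `VAR_SAMP`, and `VAR_POP` compute. It only illustrates the standard SQL semantics, not how Impala evaluates them; the sample data and the helper names `rank_desc` and `lag` are hypothetical.

```python
import statistics

# Hypothetical input rows: (name, score).
scores = [("a", 90), ("b", 85), ("c", 90), ("d", 70)]


def rank_desc(rows):
    """RANK() OVER (ORDER BY score DESC): ties share a rank, then a gap follows."""
    ordered = sorted(rows, key=lambda r: -r[1])
    ranked, prev_score, prev_rank = [], None, 0
    for position, (name, score) in enumerate(ordered, start=1):
        rank = prev_rank if score == prev_score else position
        ranked.append((name, score, rank))
        prev_score, prev_rank = score, rank
    return ranked


def lag(values, offset=1, default=None):
    """LAG(value, offset): the value `offset` rows earlier, else `default`."""
    return [values[i - offset] if i >= offset else default
            for i in range(len(values))]


values = [score for _, score in scores]
ranked = rank_desc(scores)
lagged = lag(values)
var_samp = statistics.variance(values)   # sample variance, divides by n - 1
var_pop = statistics.pvariance(values)   # population variance, divides by n
print(ranked, lagged, var_samp, var_pop)
```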
Impala 2.1 (CDH 5.3)
StateStore
Impala 2.2 (CDH 5.4)
Amazon S3 (unsupported)
Cloudera Navigator
Roadmap
2015
Nested type
EMC Isilon
2015/2016
Llama, YARN
2016
20 (multicore join/runtime/HW)
(nested type/UDF)
(Disk Spill) SQL
Cloudera Impala: SQL engine for Hadoop
BI
Impala
4 ways to try Impala
Web UI: Hue
QuickStartVM
Cloud: Cloudera Live
Cloudera Manager
Hue
Hue HP: http://gethue.com/
Hue Demo site: http://demo.gethue.com/
Query Editors: Hive/Impala
QuickStartVM
Download site: http://www.cloudera.com/content/www/en-us/downloads/quickstart_vms/5-4.html
VM: CDH, Cloudera Manager (default), 8-10 GB
Cloudera Live
Web site: http://www.cloudera.com/content/www/en-us/get-started/cloudera-live.html
Cloud (AWS) (m4.xlarge x 4)
Tableau/Zoomdata (m4.xlarge +1)
60 AWS
Cloudera Manager
root(TUI)
Readme
OS
# Download the installer, make it executable, and run it interactively
$ curl -O http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
$ chmod 755 cloudera-manager-installer.bin
$ sudo ./cloudera-manager-installer.bin
# Non-interactive run
$ sudo ./cloudera-manager-installer.bin --i-agree-to-all-licenses --noprompt --noreadme
Cloudera Manager
[Figure: installation steps 1 to 3; Cloudera Manager, CDH.]
Impala Document: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/impala.html
Engineer Blog: http://blog.cloudera.com/ (Cloudera Blog)
CDH: [email protected]
Cloudera: http://community.cloudera.com/
10%
Hadoop: http://gihyo.jp/admin/serial/01/how_hadoop_works (gihyo.jp)
Impala: 2015/12 to 2016/1
We are hiring!
Thank you.