Multi-Tenant Hadoop-as-a-Service (for free!)
Jim Dowling
Associate Prof @ KTH
Senior Researcher @ SICS
CEO @ Hops AB
SHUG Meetup, Stockholm, April 21st 2016
www.hops.io @hopshadoop
(Some Slides by Prof. Tor Björn Minde, CEO SICS North Swedish ICT AB)
• World’s First Open Data Centre for Big Data in Luleå
• Metadata in Hadoop
• True Multi-Tenancy for Hadoop
• DEMO: Spark/Flink/Hadoop-as-a-Service
Vision: SICS ICE research facility
A 2 MW datacenter research and test environment
Purpose: Increase knowledge, strengthen universities, companies and researchers
R&D institute, 5 lab modules, 3,000-4,000 servers, 2,000-3,000 square meters
What SICS ICE will offer
1. Compute capacity and tools for big data and cloud
• Hadoop/Spark/Flink-as-a-Service
2. Demonstration space for new products & solutions
3. Datacenter infrastructure for experiments and facility data
• Flexible lab modules and re-configuration
• Measurement equipment for energy, cooling, capacity
4. Competence for verticals and datacenter infrastructure
Status of SICS-ICE research facility
(ICE = Infrastructure and Cloud research Environment)
Phase 1 (1 room built)
• Establish test projects in a “room-in-room” commercial co-location facility
• Start of operation February 2016
• Officially launched in April 2016
Phase 2 (Design phase)
• Design of a flexible and general research facility, summer-fall 2016
• Contracts with Akademiska Hus & E.ON
• Plan is to start build phase spring 2017
• Plan is to start installation fall 2017
• Plan is to start operation early 2018
Phase 1 room-in-room module 1
A Data Center Optimized for Hadoop
Dell servers from Hi5 in module 1
• 3600 cores
• 40 TB RAM
• Up to 7.5 petabytes of storage
• 10/40 Gb/s network
• Separate management network
Hadoop-as-a-Service on SICS ICE
But First… MetaData in Hadoop
Metadata Totem Poles in Hadoop
Eventual Consistency
With Many Hadoop Clusters
[Diagram: Cluster 1 … Cluster N, each with its own MetaData Service, plus a MetaData Service (Aggregator)]
Eventually consistent MetaData aggregated using more eventually consistent protocols.
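The point about stacking eventually consistent layers can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class names `ClusterMetadataService` and `Aggregator`, and the `record`/`sync`/`pull` methods, are hypothetical and do not correspond to any Hadoop API. It shows that a change must cross two asynchronous hops before the aggregated view reflects it.

```python
# Hypothetical sketch: names and structure are illustrative, not a Hadoop API.

class ClusterMetadataService:
    """Per-cluster metadata store that applies cluster changes asynchronously."""

    def __init__(self, name):
        self.name = name
        self.pending = []   # changes reported by the cluster, not yet applied
        self.files = set()  # the view that readers of this service see

    def record(self, path):
        # The cluster reports a new file; it is only queued here.
        self.pending.append(path)

    def sync(self):
        # First asynchronous hop: queued changes become visible locally.
        self.files.update(self.pending)
        self.pending.clear()


class Aggregator:
    """Global view built by periodically pulling from every cluster service."""

    def __init__(self, services):
        self.services = services
        self.view = {}

    def pull(self):
        # Second asynchronous hop: copy each service's current view.
        for service in self.services:
            self.view[service.name] = set(service.files)

    def lookup(self, path):
        return [name for name, files in self.view.items() if path in files]


c1 = ClusterMetadataService("cluster1")
cn = ClusterMetadataService("clusterN")
agg = Aggregator([c1, cn])

c1.record("/data/a.csv")
agg.pull()
stale = agg.lookup("/data/a.csv")        # cluster service has not synced yet

c1.sync()
still_stale = agg.lookup("/data/a.csv")  # aggregator has not pulled again

agg.pull()
fresh = agg.lookup("/data/a.csv")        # both hops have now completed
```

Until both `sync` and a later `pull` have run, the aggregator gives an out-of-date answer, which is the staleness window the slide alludes to.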
MetaData in Hops Hadoop
[Diagram labels: HDFS, YARN, NDB, Projects, DataSets, Users, Provenance, Search, History, Custom MetaData]
Case Study: Access Control as a MetaData Service
Access Control in Relational Databases
# Multi-tenancy for alice and bob on db1 and db2
grant all privileges on db1.* to 'alice'@'%';
grant all privileges on db2.* to 'bob'@'%';
# More fine-grained privileges
grant SELECT on db2.sensitiveTable to 'alice'@'192.168.1.2';
Databases ensure the consistency of security and policies using foreign keys.
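As a minimal sketch of how foreign keys keep access-control metadata consistent, the example below uses SQLite through Python's standard `sqlite3` module rather than the MySQL-style grants on the slide. The `users`, `datasets`, and `grants` tables are hypothetical and exist only for illustration. A grant cannot reference a user that does not exist, and deleting a user removes that user's grants in the same operation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: every grant must point at an existing user and dataset.
conn.executescript("""
CREATE TABLE users (name TEXT PRIMARY KEY);
CREATE TABLE datasets (name TEXT PRIMARY KEY);
CREATE TABLE grants (
    user_name TEXT NOT NULL REFERENCES users(name) ON DELETE CASCADE,
    dataset   TEXT NOT NULL REFERENCES datasets(name) ON DELETE CASCADE,
    privilege TEXT NOT NULL,
    PRIMARY KEY (user_name, dataset, privilege)
);
INSERT INTO users VALUES ('alice'), ('bob');
INSERT INTO datasets VALUES ('db1'), ('db2');
INSERT INTO grants VALUES ('alice', 'db1', 'ALL'), ('bob', 'db2', 'ALL');
""")

# A grant for an unknown user violates the foreign key and is rejected.
try:
    conn.execute("INSERT INTO grants VALUES ('mallory', 'db1', 'SELECT')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Removing a user cascades to that user's grants, so no orphaned
# privileges are left behind.
conn.execute("DELETE FROM users WHERE name = 'bob'")
remaining = conn.execute(
    "SELECT user_name, dataset FROM grants ORDER BY user_name"
).fetchall()
```

Because the users and their privileges live in the same database, the policy cannot drift out of step with the entities it refers to, unlike the separately maintained metadata services discussed earlier.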