Feb 24, 2018

  • 2014 CY Lin, Columbia University. E6893 Big Data Analytics Lecture 11: Linked Big Data: Graph Computing

    E6893 Big Data Analytics Lecture 11: Linked Big Data Graphical Models (II)

    Ching-Yung Lin, Ph.D.

    Adjunct Professor, Dept. of Electrical Engineering and Computer Science

    Mgr., Dept. of Network Science and Big Data Analytics, IBM Watson Research Center

    November 13th, 2014

  • Course Structure

    Class Date  Number  Topics Covered
    09/04/14    1       Introduction to Big Data Analytics
    09/11/14    2       Big Data Analytics Platforms
    09/18/14    3       Big Data Storage and Processing
    09/25/14    4       Big Data Analytics Algorithms I
    10/02/14    5       Big Data Analytics Algorithms II (recommendation)
    10/09/14    6       Big Data Analytics Algorithms III (clustering)
    10/16/14    7       Big Data Analytics Algorithms IV (classification)
    10/23/14    8       Big Data Analytics Algorithms V (classification & clustering)
    10/30/14    9       Linked Big Data Graph Computing I (Graph DB)
    11/06/14    10      Linked Big Data Graph Computing II (Graph Analytics)
    11/13/14    11      Linked Big Data Graph Computing III (Graphical Models & Platforms)
    11/20/14    12      Final Project First Presentations
    11/27/14            Thanksgiving Holiday
    12/04/14    13      Next Stage of Big Data Analytics
    12/11/14    14      Big Data Analytics Workshop Final Project Presentations

  • Potential New Course (TBD): Advanced Big Data Analytics

    Class Date  Number  Topics Covered
    01/22/15    1       Fundamentals of Parallel and Distributed Computing on Data Analytics
    01/29/15    2       Modern CPU and system architecture for Big Data Analytics
    02/05/15    3       In-Memory Cluster Computing
    02/12/15    4       Data Analysis on Distributed Memory (Spark)
    02/19/15    5       Parallel Programming Language for Multicore (X10)
    02/26/15    6       Scalable Graph Analytics Algorithms (ScaleGraph Analytics)
    03/05/15    7       Large-Scale Bayesian Network and Probabilistic Inference
    03/12/15    8       GPU architectures for Big Data Analytics
    03/26/15    9       Parallel processing and programming for GPU
    04/02/15    10      Analytics Library for GPU
    04/09/15    11      Final Project Proposal Presentation
    04/16/15    12      Large-Scale Deep Learning and Artificial Neural Network
    04/23/15    13      Cognitive Analytics Applications
    04/30/15    14      Big Data Visualizations challenges and solutions
    05/07/15    15      Final Project Presentations

  • Outline

    Overview and Background
    Bayesian Network and Inference
    Node level parallelism and computation kernels
    Architecture-aware structural parallelism and scheduling
    Conclusion

    Dr. Yinglong Xia, IBM Watson Research Center

  • Big Data Analytics and High Performance Computing

    Big Data Analytics
      Widespread impact
      Rapidly increasing volume of data
      Real-time requirement
      Growing demand for High Performance Computing algorithms

    High Performance Computing
      Parallel computing capability at various scales
      Multicore is the de facto standard
      From multi-/many-core to clusters
      Accelerate Big Data analytics

    [Figure: application areas. Anomaly detection; Ecology Market Analysis; Social Tie Discovery; Business Data Mining; Recommendation; Behavior prediction]

    Twitter has 340 million tweets and 1.6 billion queries a day from 140 million users

    BlueGene/Q Sequoia has 1.6 million cores offering 20 PFlops, ranked #1 in TOP500

  • Example Graph Sizes and Graph Capacity

    [Figure: example graph sizes plotted by Scale and Throughput; tick labels 1G, 10G, 100G, 1T, 1P and 100T, 10P, 100P, 1E]

    Scale is the number of nodes in a graph, assuming the node degree is 25, the average degree in Facebook

    Throughput is the tera-number of computations taking place in a second
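As a rough illustration of the scale definition above, the following sketch turns a node count into an edge count using the stated average degree of 25. The function name and the choice to count each undirected edge once (so edges = nodes x degree / 2) are assumptions made for this example and do not come from the slides.

```python
AVG_DEGREE = 25  # average node degree assumed on the slide


def estimated_edges(num_nodes: int, avg_degree: int = AVG_DEGREE) -> int:
    """Estimate the undirected edge count.

    Assumption: every edge contributes to the degree of two nodes,
    so the edge count is nodes * degree / 2.
    """
    return num_nodes * avg_degree // 2


# Node counts matching some of the scale labels in the figure.
for label, nodes in [("1G", 10**9), ("1T", 10**12), ("1P", 10**15)]:
    print(f"{label} nodes -> about {estimated_edges(nodes):.2e} edges")
```

Under this assumption the edge count grows linearly with scale, so each step along the scale axis multiplies the edge count by the same factor as the node count.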

  • Challenges: Graph Analytic Optimization Based on Hardware Platforms

    Development productivity vs. system performance
      High-level developers have limited knowledge of system architectures
      Not considering platform architectures makes it difficult to achieve high performance

    Performance optimization requires extensive knowledge of system architectures
    Optimization strategies vary significantly from one graph workload type to another

    Multicore: mutex locks for synchronization, possible false sharing in cache data, collaboration between threads

    Manycore: high computation capability, but also possibly high coordination overhead

    Heterogeneous capacity of cores in systems where task affiliation is different

    High latency in communication among cluster compute nodes

  • Challenges: Input-Dependent Computation & Communication Cost

    Understanding computation and communication/synchronization overheads helps improve system performance

    The ratio is HIGHLY dependent on the input graph topology and graph properties
      Different algorithms lead to different relationships
      Even for the same algorithm, the input can impact the ratio

    [Figure: Parallel BFS in BSP model, computation vs. synchronization per thread. Blue: computation. Yellow: synchronization/communication]

    Different input leads to different ratios
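The input dependence described above can be illustrated with a small level-synchronous BFS, where each BFS level plays the role of one BSP superstep. This is a single-process sketch, not the implementation measured in the lecture: `bsp_bfs`, `undirected`, and the two toy graphs are hypothetical, edge scans stand in for computation, and the superstep count stands in for barrier synchronizations.

```python
from collections import defaultdict


def bsp_bfs(adj, source):
    """Level-synchronous BFS: one superstep per BFS level.

    Returns (levels, edges_scanned, supersteps).
    """
    levels = {source: 0}
    frontier = [source]
    edges_scanned = 0
    supersteps = 0
    while frontier:
        next_frontier = []
        # Local computation: expand every vertex in the current frontier.
        for u in frontier:
            for v in adj[u]:
                edges_scanned += 1
                if v not in levels:
                    levels[v] = levels[u] + 1
                    next_frontier.append(v)
        # Barrier: parallel workers would synchronize here before the next level.
        supersteps += 1
        frontier = next_frontier
    return levels, edges_scanned, supersteps


def undirected(edges):
    """Build an adjacency list that stores each edge in both directions."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


n = 8
path = undirected([(i, i + 1) for i in range(n - 1)])  # long and thin
star = undirected([(0, i) for i in range(1, n)])       # short and wide

for name, graph in [("path", path), ("star", star)]:
    _, work, steps = bsp_bfs(graph, 0)
    print(f"{name}: {work} edge scans, {steps} supersteps, "
          f"{work / steps:.1f} scans per barrier")
```

Both toy graphs have the same number of edges, so the same algorithm does the same amount of scanning on each, yet the path needs many more barriers than the star. That is one simple way the computation-to-synchronization ratio can change with the input alone.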

  • Graph Workload

    Traditional (non-graph) computations
      Example: scientific computations, e.g., Matrix A * B
      Characteristics:
        * Regular memory access
        * Relatively good data locality
        * Computationally intensive

    Graph analytic computations
      Case study: probabilistic inference in graphical models is a representative workload
        (a) Vertices + edges: graph structure
        (b) Parameters (CPTs) on each node: graph property
        (c) Changes of graph structure: graph dynamics

    Graphical Model: Graph + Parameters (Properties)
      Node: random variable
      Edge: conditional dependency
      Parameter: conditional distribution

    [Figure: inference for anomalous behaviors in a social network, with a conditional distribution table (CPT)]
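To make the node / edge / parameter mapping concrete, here is a minimal sketch of a three-node graphical model with binary variables. The structure (A -> B, A -> C), all probability values, and the helper names `joint` and `posterior` are invented for illustration. Inference is done by brute-force enumeration, which is only practical for tiny networks and is not the method the lecture goes on to discuss.

```python
from itertools import product

# Graph structure: each node lists its parents (edges = conditional dependency).
parents = {"A": (), "B": ("A",), "C": ("A",)}

# Graph property: one CPT per node, giving P(node = 1 | parent values).
# The numbers are illustrative only.
cpt = {
    "A": {(): 0.1},
    "B": {(0,): 0.2, (1,): 0.9},
    "C": {(0,): 0.3, (1,): 0.8},
}


def joint(assignment):
    """P(A, B, C) as the product of each node's conditional probability."""
    p = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assignment[q] for q in pa)]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p


def posterior(query, evidence):
    """P(query = 1 | evidence) by enumerating every full assignment."""
    num = den = 0.0
    nodes = list(parents)
    for values in product((0, 1), repeat=len(nodes)):
        a = dict(zip(nodes, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the observed values
        p = joint(a)
        den += p
        if a[query] == 1:
            num += p
    return num / den


print(round(posterior("A", {"B": 1, "C": 1}), 4))
```

Enumeration touches every joint assignment, so its cost doubles with each added binary variable. That growth is why structure-aware inference and parallelism matter for large models.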

  • Graph Workload Types

    Type 1: Computations on graph structures / topologies
      Example: converting a Bayesian network into a junction tree, graph traversal (BFS/DFS), etc.
      Characteristics: poor locality, irregular memory access, limited numeric operations

    Type 2: Computations on graphs with rich properties
      Example: belief propagation, which diffuses information through a graph using statistical models
      Characteristics: locality and memory access pattern depend on vertex models; typically a lot of numeric operations; hybrid workload

    Type 3: Computations on dynamic graphs
      Example: streaming graph clustering, incremental k-core, etc.
      Characteristics: poor locality, irregular memory access; operations to update a model (e.g., cluster, sub-graph); hybrid workload

    [Figures: Bayesian network to junction tree; 3-core subgraph]
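For readers unfamiliar with the k-core mentioned under Type 3, the sketch below computes it by static peeling: repeatedly remove any vertex with fewer than k remaining neighbors. This is the plain batch algorithm, not the incremental variant named on the slide, and the function name `k_core` and the example graph are hypothetical.

```python
def k_core(adj, k):
    """Return the vertex set of the k-core by peeling low-degree vertices."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    alive = set(adj)
    stack = [v for v in alive if degree[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already peeled via another path
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < k:
                    stack.append(u)
    return alive


# Hypothetical graph: a 4-clique {1, 2, 3, 4} with a pendant path 4-5-6.
adj = {
    1: {2, 3, 4},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {1, 2, 3, 5},
    5: {4, 6},
    6: {5},
}
print(sorted(k_core(adj, 3)))
```

The peeling loop follows neighbor pointers in an order driven by the data, which matches the "poor locality, irregular memory access" characteristic listed for this workload type.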

  • Scheduling on Clusters with Distributed Memory

    Necessity for utilizing clusters with distributed memory
      Increasing capacity of aggregated resources
      Accelerate computation even when a graph can fit into shared memory

    Generic distributed solution remains a challenge
      Optimized partitioning is NP-hard for large graphs, especially for dynamic graphs

    Graph RDMA enables a virtual shared memory platform
      Merits: remote pointers, one-sided operations, near wire-speed remote access

    The overhead due to remote communication among 3 machines is very low due to RDMA

    Graph RDMA uses key-value queues with remote pointers

  • Explore BlueGene and HMC for Superior Performance

    BlueGene offers superior performance for large-scale graph-based applications
      4-way 18-core PowerPC x 1024 compute nodes
      20 PFLOPS
      Winner of GRAPH500 (Sequoia, LLNL)
      Exploit RDMA, FIFO and PAMI to achieve high performance

    Hybrid Memory Cube for Deep Computing
      Parallelism at 3 levels
      Data-centric computation
      Innovative computation model

    [Figure: Inference in BSP model. Node states: Visited, Frontier, New Frontier, Not Reached. Loop: propagate belief to BN nodes in Frontier (local computation), check convergence, global communication]
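The frontier-based loop in the figure can be sketched in a few lines under strong simplifying assumptions: the Bayesian network is a directed tree of binary variables, there is no evidence, and the "belief" pushed outward is just the prior marginal P(node = 1). The function name, data layout, and all numbers are hypothetical, and the whole loop runs in a single process, so the communication step appears only as a comment.

```python
def propagate_marginals(children, cpt, root, p_root):
    """Push P(node = 1) from the root outward, one tree level per superstep.

    children: node -> list of child nodes (a directed tree is assumed).
    cpt: child -> (P(child=1 | parent=0), P(child=1 | parent=1)).
    """
    belief = {root: p_root}  # visited nodes and their P(node = 1)
    frontier = [root]
    supersteps = 0
    while frontier:  # the convergence test reduces to: nothing left to reach
        new_frontier = []
        # Local computation: each frontier node updates its children.
        for parent in frontier:
            for child in children.get(parent, []):
                p0, p1 = cpt[child]
                belief[child] = (1 - belief[parent]) * p0 + belief[parent] * p1
                new_frontier.append(child)
        # Global communication: in a distributed run, updated beliefs and the
        # new frontier would be exchanged between workers at this point.
        frontier = new_frontier
        supersteps += 1
    return belief, supersteps


children = {"a": ["b", "c"], "b": ["d"]}
cpt = {"b": (0.2, 0.9), "c": (0.5, 0.5), "d": (0.1, 0.7)}
belief, steps = propagate_marginals(children, cpt, "a", 0.3)
print(belief, steps)
```

Each pass of the `while` loop corresponds to one superstep: nodes in the frontier do local work, then the frontier advances. A general network with evidence or loops would need a real convergence test in place of the empty-frontier check.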

  • Outline

    Overview and Background
    Bayesian Network and Inference
    Node level parallelism and computation kernels
    Architecture-aware structural parallelism and scheduling
    Conclusion

  • Large Scale Graphical Models for Big Data Analytics

    Big Data Analytics
