Alex Wang, Ph.D., Big Data Lab @ Hitachi
Co-ranking Multiple Entities in Online Enterprise Social Network -- A Big Data Approach
© Hitachi America, Ltd. 2016. All rights reserved.
1. Reorganization of the R&D Group
Reorganize the global R&D organization along 3 innovation strategy axes
Co-create services and solutions with customers
Generate innovative products by focusing on strong technology platforms & their deployment
Pioneer new frontiers through vision-driven exploratory basic research
approx. 500
approx. 2,000
approx. 100
Customer-driven (customer co-creation)
Technology-driven (technology innovation)
Vision-driven (basic exploration)
2. CSI-NA Organization & Location
CSI-NA 100 HC
Silicon Valley RC: Big Data Analytics/Service PFIT/IoT/UXD, Detroit RC: Automotive, Brazil: Resources
Silicon Valley RC, US: Customer Co-creation
Big Data Analytics
· Energy · Social Media · ITCS · Healthcare · Financial
· Predictive/Prescriptive Analytics · Common Analytics Platform
Focus Verticals
Digital Service Platform (ITPF, IoT and Network) User Experience Design
Detroit RC, US: Customer Co-creation
· Automotive Focus Verticals
Sao Paulo, Brazil Customer Co-creation
· Resources · Agriculture
Focus Verticals
ITCS: Information and Telecommunication System, UXD: User Experience Design
Co-create advanced solutions with customers on common Big Data Analytics Platform
3. Global Center for Social Innovation (CSI)
Center for Technology Innovation
・Service design ・Ethnography
Co-design method Utilize technology platforms
Identify issues together with customers & provide solutions
• Share vision with customer
• Generate new concepts, develop prototype & demos
• Develop solution
Same sector ↓ Different sector deployment
• Proof-of-concept at customer site
CSI activity: Commercialize
Co-creation approach
4. Evolution of Big Data
Enterprise Data
Human-Generated Data
Machine Data
“50 ZB of data created”
1.3T Tags and sensors
144.8 billion email messages per day
340 million tweets per day
684,000 bits of content shared on Facebook per day
11 million instant messages per day
72 hours of video uploaded on YouTube per minute
4B people online 31B Devices
“By 2020, there will be more information coming from everywhere”
5. Evolution of Analytics Sophistication
Sophistication
Value
Information
Knowledge in Advance
Optimization
What happened and Why?
What will happen and When?
What is the best action to take?
Descriptive Analytics
Predictive Analytics
Prescriptive Analytics
6. Analytics: From Data to Actionable Insights
Big Data are high volume, high velocity, high variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight & decision making “at the speed of business”
Data Types & Sources
Machine Data (Realtime Data)
User Generated Data
Documentation
Newsfeeds / public sources
Transaction Data
Big Data Analytics
Descriptive
Predictive
Prescriptive
Better Insights, Better Decisions
Business Outcome
IT X OT powering Big Data Analytics powering Big Decisions
Feedback Loop
7. Big Data Lab Mission and Strategy
Repeatable solutions delivered as applications or services by HCC, HDS, and other Hitachi companies.
Create innovative social innovation solutions that deliver business value to customers via Hitachi business divisions
Strategies
• Big Data Technologies: key analytics solution IP
• Solutions Framework for developing, deploying, and executing repeatable, high-value analytic applications in multiple industries
• Big Data Solutions PoCs that lead to highly leveraged solutions in selected domains (collaboration with HGC-IA, customers/partners)
• Solutions Showcase
Delivery Model
HCC: Hitachi Consulting Company, HDS: Hitachi Data Systems, HGC-IA: Hitachi Global Center for Innovative Analytics, IP: Intellectual Property, PoC: Proof of Concept
8. Analytics Solutions and Platform & Technologies
Analytics for Social Media
Analytics for Financial/Retail
Analytics for Natural Resources
Analytics for Healthcare
Analytics for Power/Energy
Analytics for Public Safety
Analytics for Information and Telecommunication
ANALYTICS SOLUTIONS
ANALYTICS PLATFORM and TECHNOLOGIES
• descriptive, predictive, prescriptive analytics, machine learning, deep learning and AI algorithms
• simulation
• mathematical optimization
• information visualization and visual analytics
• real-time and streaming analytics
• analytics over unstructured and semi-structured data (text, images, video)
• analytics framework for rapid development and deployment
• IoT platform architectures
• edge/core analytics
9. Customer-driven Big Data Projects
• Power/Energy (Power T&D, BEMS): Predictive analytics with occupancy data for building performance efficiency; PMU data analytics for detecting and predicting disturbances on the power grid
• Automotive (Connected car): Telematics data analytics for improving life with cars
• Social Media (Social Network)
• Telecom: Real-time network monitoring, analytics and optimization
• Healthcare (Population health): Risk Stratification, Disease Progression, Triage
• Public Safety (Video surveillance): Video surveillance with advanced video analytics (multi-perspective search)
• Natural Resource (Oil & Gas, Mining): Shale Oil/Gas well production optimization; Overall Operational Efficiency, Dynamic dispatch, predictive maintenance
• Fleet management: Predictive analytics for fleets with telematics data
• Predictive Maintenance: For fleet management, Mining, Chillers, …
• Financial (Fintech): New financial business models with Fintechs (Blockchain etc.)
BEMS: Building Energy Management System, PMU: Phasor Measurement Unit, T&D: Transmission and Distribution
All Rights Reserved, Copyright(C) 2003, Hitachi, Ltd. -10- Hitachi Confidential
Sentiment Analysis by Hitachi America
• Primary Research Goal
– Establishing “PSM (Public Sector Marketing) Process”
– Introducing “Voice of the Crowd” to the existing and/or future infrastructures
– Developing technologies required for its process
• Case Study: Sentiment Analysis of Public Transportation System Services using Twitter Data
– A real data set is collected using the Twitter Streaming API
– “public transportation” is used as the keyword
– Classifying data into 3 classes: Positive, Negative, Neutral
Major City            Positive User   Negative User
Austin, TX                  8%              1%
Boston, MA                 17%             13%
Chicago, IL                16%             15%
Los Angeles, CA             1%             11%
New York, NY               30%             49%
San Francisco, CA           7%              5%
Washington, D.C.           21%              6%
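The per-city shares shown for each sentiment class can be derived from classified users with a simple count. The following is a minimal sketch, assuming each user has already been assigned a city and one of the three classes; the function name `city_shares` and the sample records are hypothetical and are not taken from the study's data.

```python
from collections import Counter

def city_shares(users, sentiment):
    """Return {city: percentage of users with the given sentiment}.

    `users` is an iterable of (city, label) pairs; this input layout is an
    assumption made for illustration.
    """
    counts = Counter(city for city, label in users if label == sentiment)
    total = sum(counts.values())
    if total == 0:
        return {}  # no users in this class, so no shares to report
    return {city: 100.0 * n / total for city, n in counts.items()}

# Hypothetical toy records, not the collected Twitter data.
users = [
    ("New York, NY", "negative"),
    ("New York, NY", "negative"),
    ("Boston, MA", "negative"),
    ("Chicago, IL", "positive"),
    ("Boston, MA", "neutral"),
]
negative_shares = city_shares(users, "negative")
```

Within one sentiment class the shares sum to 100, which is why each column of city percentages adds up to a whole.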
Social Analytics by Hitachi America
• Primary Research Goal
– Analyzing and modeling the social graphs in enterprise social networks
– Studying the impact of user attributes (e.g., geographic location and rank) on user interactions
• Case Study: Social Graph Analysis of a large enterprise social network
– A real data set is collected using the Jive Software REST API
– A directed social graph is built based on users' “following” relationships
Node Size: In-Degree (Number of Followers)
• Social media applications such as Jive, Salesforce.com, and Slack have been widely adopted by enterprises.
Enterprise Social Network
• The importance of a document posted in the Enterprise Social Network (ESN) is different from traditional document collections or Web pages.
• More importantly, it is different from public social networks such as Facebook and Twitter, which view the entire social network as a “flat” graph.
• In most enterprise social media applications, the documents are organized in a hierarchical structure, which may include “group”, “space”, “project”, etc.
• Also, the author of a document plays a critical role in the importance of that document.
The importance of a Document in ESN
• “who posted it” – A document posted by an influential user may get a better chance to be viewed by others
• “where it is posted” – A document that is posted in a popular space may get a better chance to be viewed by others
• The document importance also affects the influences of the users and popularities of the spaces
• We formulate a system of equations that describes the joint relationship among the ranks of documents, users, and spaces, and use its solution to solve the document ranking problem.
Evaluating the importance of a Document in ESN
• Relationships among Author, Document, and Space
Reinforcing relationship
• The ESN is modeled as a directed graph based on the “following” relationships between users
– The graph G = (V, A) consists of a set V of nodes (vertices) representing user accounts and a set A of arcs (directed edges) that connect nodes.
– The in-degree $d_I(i)$ of a node $v_i$, which is the number of arcs into $v_i$, is equal to the number of followers of user i.
Social graph
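As an illustration of the graph model, the in-degree of every node can be counted directly from the list of arcs. This is a minimal sketch, assuming each arc is stored as a (follower, followee) pair; the function name `in_degrees` and the sample users are hypothetical.

```python
from collections import Counter

def in_degrees(arcs):
    """Return {user: number of followers} for arcs given as (follower, followee)."""
    counts = Counter(followee for _follower, followee in arcs)
    # Users who follow others but have no followers still belong to V.
    for follower, _followee in arcs:
        counts.setdefault(follower, 0)
    return dict(counts)

# Hypothetical toy data: an arc (a, b) means "a follows b".
arcs = [("alice", "bob"), ("carol", "bob"), ("bob", "alice")]
degrees = in_degrees(arcs)
```

Here `degrees["bob"]` counts the two arcs pointing into bob, which matches the "number of followers" reading of in-degree.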
• Initial ranking of the users and spaces in the ESN
• Initial influence of user i is defined as $y_i^0 = d_I(i) / N_u$
– $d_I(i)$ is the in-degree of node $v_i$
– $N_u$ is the total number of arcs in the graph
• Initial popularity of space s is $z_s^0 = N_s / N_d$
– $N_d$ denotes the total number of documents posted in the ESN and $N_s$ the number of documents posted in space s
Initial Ranking
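The two initial scores can be computed with plain counting. The sketch below assumes arcs are stored as (follower, followee) pairs and documents as a mapping from document id to space id; both layouts, the function names, and the sample values are hypothetical.

```python
from collections import Counter

def initial_user_influence(arcs):
    """Initial influence: in-degree of each user divided by the number of arcs."""
    if not arcs:
        return {}
    n_arcs = len(arcs)
    followers = Counter(followee for _follower, followee in arcs)
    users = {user for arc in arcs for user in arc}
    return {user: followers[user] / n_arcs for user in users}

def initial_space_popularity(doc_space):
    """Initial popularity: documents in each space divided by all documents."""
    if not doc_space:
        return {}
    n_docs = len(doc_space)
    per_space = Counter(doc_space.values())
    return {space: n / n_docs for space, n in per_space.items()}

# Hypothetical toy data.
arcs = [("alice", "bob"), ("carol", "bob"), ("bob", "alice")]
doc_space = {"D1": "S1", "D2": "S1", "D3": "S2",
             "D4": "S2", "D5": "S2", "D6": "S2"}
y0 = initial_user_influence(arcs)
z0 = initial_space_popularity(doc_space)
```

Because every arc ends at exactly one user and every document sits in exactly one space in this toy setup, each set of initial scores sums to 1.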
• Document importance $x_d$, user influence $y_i$, and space popularity $z_s$ [equations not recoverable from the source]
• The equations can be rewritten in a compact manner.
• Substituting the last two equations into the first yields a single linear equation in the document importance.
• The document importance is then given by the solution to that linear equation.
Calculating the importance
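The slide's equations did not survive in this text, so the following is not the authors' formulation. It is a minimal sketch that assumes one simple linear coupling consistent with the description: a document's importance is a weighted sum of its author's influence and its space's popularity, and each user and space receives feedback from the importance of its documents. The weights `a`, `b`, `c`, `d`, the function name `co_rank`, and the toy data are all hypothetical. The linear system is solved here by fixed-point iteration, which converges when the feedback weights are small.

```python
def co_rank(doc_author, doc_space, y0, z0,
            a=0.5, b=0.5, c=0.1, d=0.1, tol=1e-12, max_iter=1000):
    """Iterate an ASSUMED coupled system until the document scores settle:

        x_d = a * y_author(d) + b * z_space(d)
        y_i = y0_i + c * (sum of x_d over documents written by i)
        z_s = z0_s + d * (sum of x_d over documents posted in s)
    """
    y, z = dict(y0), dict(z0)
    if not doc_author:
        return {}, y, z
    x = {doc: 0.0 for doc in doc_author}
    for _ in range(max_iter):
        # Document importance from the current user and space scores.
        new_x = {doc: a * y[doc_author[doc]] + b * z[doc_space[doc]]
                 for doc in doc_author}
        # Feedback: restart from the initial scores and add document mass.
        y, z = dict(y0), dict(z0)
        for doc, score in new_x.items():
            y[doc_author[doc]] += c * score
            z[doc_space[doc]] += d * score
        change = max(abs(new_x[doc] - x[doc]) for doc in new_x)
        x = new_x
        if change < tol:
            break
    return x, y, z

# Hypothetical toy data: 3 authors, 2 spaces, 5 documents.
doc_author = {"D1": "A", "D2": "A", "D3": "B", "D4": "C", "D5": "B"}
doc_space = {"D1": "S1", "D2": "S2", "D3": "S2", "D4": "S1", "D5": "S2"}
y0 = {"A": 0.5, "B": 0.3, "C": 0.2}   # assumed initial user influence
z0 = {"S1": 0.4, "S2": 0.6}           # documents per space / all documents
x, y, z = co_rank(doc_author, doc_space, y0, z0)
ranking = sorted(x, key=x.get, reverse=True)
```

In this toy setup the document written by the most influential author in the most popular space ranks first, and the document by the least influential author in the less popular space ranks last, which is the reinforcing behavior the slides describe.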
20. Common Analytics Framework & Environment
Leveraging Hitachi’s unique experience and IP for customer PoCs
Data Source Billing Supplier Customer Operations Production Maintenance Supply Chain
Dashboard Designer Flow Builder
Application Designer
OT Domain Expert
OT X IT Solution Component
OT X IT Dashboard(s)
Ranking documents in online enterprise social network
• Social media applications such as Jive, Salesforce.com, and Yammer have been widely adopted by enterprises.
• The importance of a document posted in the Enterprise Social Network (ESN) is different from traditional document collections or Web pages.
• More importantly, it is different from public social networks such as Facebook and Twitter
– The entire social network is viewed as a “flat” graph.
• In most enterprise social media applications
– The documents are organized in a hierarchical structure (“group”, “space”, “project”, etc.)
– The author of a document plays a critical role in the importance of that document.
• Proposing a novel method to rate the importance of every document posted in an ESN objectively.
Summary
• To rate the importance of a document posted in an ESN, we consider two major factors:
– “who posted it”: A document posted by an influential user may get a better chance to be viewed by others
– “where it is posted”: A document that is posted in a popular space may get a better chance to be viewed by others
• The document importance also affects the influences of the users and popularities of the spaces.
• We formulate a system of equations that describes the joint relationship among the ranks of documents, users, and spaces, and use its solution to solve the document ranking problem.
[Figure: authors (A, B, C), documents (D1 to D6), and spaces (S1, S2)]
Document Ranking Calculation
• Initial ranking of the users and spaces in the ESN
• Initial influence of user i is defined as $y_i^0 = d_I(i) / N_u$
– $d_I(i)$ is the in-degree of node $v_i$
– $N_u$ is the total number of arcs in the graph
• Initial popularity of space s is $z_s^0 = N_s / N_d$
– $N_d$ denotes the total number of documents posted in the ESN and $N_s$ the number of documents posted in space s
[Figure: example social graph with nodes A, B, C, D, E]
• We define the document importance $x_d$, user influence $y_i$, and space popularity $z_s$ as the solution to the following system of equations: