Lehrstuhl Informatik 5 (Informationssysteme), Prof. Dr. M. Jarke
TeLLNet, GALA
Network Flow and Network Formation: A Social Network Analysis Perspective
Ralf Klamma, RWTH Aachen University, Informatik 5 (DBIS)
Ringvorlesung (lecture series) of the Research School Business & Economics (RSBE), University of Siegen, Germany, June 28, 2011
Agenda
Network Science
Network Flow
Network Formation
Conclusions and Outlook
RWTH Aachen University
• 1,250 spin-off businesses have created around 30,000 jobs in the greater Aachen region over the past 20 years.
• IDEA League
• Germany’s Excellence Initiative: 3 clusters of excellence, a graduate school and the institutional strategy “RWTH Aachen 2020: Meeting Global Challenges”
• 260 institutes in 9 faculties, among Europe’s leading institutions for science and research
• Currently around 31,400 students are enrolled in over 100 academic programs
• Over 5,000 of them are international students hailing from 120 different countries
Community Information Systems Research Group
Established at the DBIS chair, RWTH Aachen University
3 postdocs, 7 PhD students, plus paid student workers and thesis workers
Network Science
Questions within Network Science
How well positioned is an agent to receive and disseminate information?
Who and what affects an agent?
– influence networks [Lewis, 2008]
Which groups/communities does an agent belong to?
– community mining [Clauset et al., 2004]
Executive Board Networks: TheyRule.net
A prototype as of 2004
What is the connection between Motorola and Whirlpool?
What does the network of academic institutes and companies look like?
Who rules 3M, Motorola, AT&T, Coca-Cola, PepsiCo, and McDonald’s?
Spread of Contagion
Source: orgnet.com
Network Science Paradigms
Merging analytic and engineering paradigms
In an analytic discipline
– To find laws (computing paradigms)
– To generate phenomena
– To explain observed phenomena
In an engineering discipline
– To realize and implement the paradigms of networks
– To understand the cases when particular technologies should be used
– To store network data efficiently (Mediabase)
Communication serves a purpose
– Scientific disciplines
– Commerce
– Entertainment
– Politics
Presenter notes
(source: CACM paper: pp. 63)
Web Science: The Long Tail & Fragments
The Web is a scale-free, fragmented network
– The power law (Pareto distribution etc.)
– 95% of users are located in the Long Tail (communities)
– Trust- and passion-based cooperation
[Figure labels: Island, Tendrils, IN Continent, Central Core, OUT Continent, Tunnels]
[Barabasi, 2002]
[Anderson, 2006]
Principal Analytic Approach
Interdisciplinary multidimensional model of networks
– Social network analysis (SNA) defines measures for social relations
– Actor-network theory (ANT) connects human and media agents
– The i* framework defines strategic goals and dependencies
– The theory of media transcriptions studies cross-media knowledge
[Figure: Media Networks]
– Social software: Wiki, Blog, Podcast, IM, Chat, Email, Newsgroup, …
– i* dependencies (structural, cross-media)
– Members (social network analysis: centrality, efficiency)
– Network of artifacts: Microcontent, Blog entry, Message, Burst, Thread, Comment, Conversation, Feedback (Rating)
– Network of members: communities of practice
Presenter notes
Improvement!
MediaBase
Collection of social software artifacts with parameterized Perl scripts
– Mailing lists
– Newsletters
– Web sites
– RSS feeds
– Blogs
Database support by IBM DB2, eXist, Oracle, ...
Web interface based on Firefox plugin, Plone/Zope, widgets, ...
Strategies of visualization
– Tree maps
– Cross-media graphs
Klamma et al.: Pattern-Based Cross Media Social Network Analysis for Technology Enhanced Learning in Europe, EC-TEL 2006
Network Flow
Fundamentals: Definition of a Network
A network Γ = (N, L), where
– N = {1, 2, ..., n} is a (finite) set of nodes (vertices)
– L ⊆ N × N is a set of links (edges)
Assumed:
– Unweighted
– No multiple links
  => at most one link exists between two given nodes
  => two linked nodes are neighbors (adjacent)
– Directed or undirected
Fundamentals of networks
Definitions in a Network
Degree of a node: number of incoming and outgoing links, $z_i = |\{ j \in N : ij \in L \}|$
A path is a sequence of nodes v0, …, vn-1 with (vi, vi+1) ∈ L for 0 ≤ i < n-1
A path is a set of connected links
Length of a path: number of links on the path
A path is a simple path if all vertices on the path are pairwise different
A cycle is a path with v0 = vn-1 and length n ≥ 2
A subnetwork of a network Γ = (N, L) is a graph Γ’ = (N’, L’) with N’ ⊆ N and L’ ⊆ L
Representation of Networks
Adjacency matrix representation: an n × n matrix A, where
$a_{ij} = 1$ if $(i, j) \in L$, and $a_{ij} = 0$ otherwise
Neighborhood: $\mathcal{N}_i \equiv \{ j \in N : (i, j) \in L \}$
Any network is the collection of neighborhoods: $\Gamma = \{ \mathcal{N}_i \}_{i \in N}$
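The two representations can be sketched in a few lines of Python. This is only an illustration: the four-node example network and the helper names `undirected`, `adjacency_matrix` and `neighborhoods` are assumptions, not taken from the slides.

```python
def undirected(links):
    """Symmetrize a set of links so that (i, j) and (j, i) are both present."""
    return set(links) | {(j, i) for i, j in links}

def adjacency_matrix(n, links):
    """Return the n x n 0/1 matrix A with a_ij = 1 iff (i, j) is a link (nodes are 1..n)."""
    a = [[0] * n for _ in range(n)]
    for i, j in links:
        a[i - 1][j - 1] = 1
    return a

def neighborhoods(n, links):
    """Return the neighborhood of every node: all j with (i, j) in the link set."""
    nbrs = {i: set() for i in range(1, n + 1)}
    for i, j in links:
        nbrs[i].add(j)
    return nbrs

# Hypothetical example network: a triangle 1-2-3 with node 4 attached to node 3.
links = undirected({(1, 2), (2, 3), (3, 1), (3, 4)})
A = adjacency_matrix(4, links)
hoods = neighborhoods(4, links)
degrees = {i: len(hoods[i]) for i in hoods}  # z_i = size of the neighborhood
```

Both structures carry the same information: row i of `A` marks exactly the members of `hoods[i]`.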
Boolean Adjacency Matrix Example
For network Γ1, the adjacency matrix is as follows:
true = 1 if there exists a link between two nodes, false = 0 otherwise
For any network Γ, its degree distribution p(·) specifies, for each k = 0, 1, …, n-1, the fraction of nodes with degree k:
$p(k) = \frac{1}{n} |\{ i \in N : z_i = k \}|$
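A minimal sketch of the degree distribution, assuming the network is given as a dictionary of neighborhoods. The example network and the function name `degree_distribution` are illustrative assumptions.

```python
from collections import Counter

def degree_distribution(neighbors):
    """Return p(k) for k = 0..n-1: the fraction of nodes whose degree equals k."""
    n = len(neighbors)
    counts = Counter(len(nbrs) for nbrs in neighbors.values())
    return {k: counts.get(k, 0) / n for k in range(n)}

# Hypothetical example: a triangle 1-2-3 with node 4 attached to node 3.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
p = degree_distribution(example)
```

The values of `p` sum to 1, since every node has exactly one degree.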
Network Characteristics: Geodesic Distances
The geodesic distance d(i, j) between nodes i and j is the minimum number of links on a path connecting i and j; if no such path exists, d(i, j) = +∞
The distribution $\varpi$ specifies the fraction of node pairs at distance r:
$\varpi(r) = \frac{|\{ (i, j) \in N \times N : d(i, j) = r \}|}{n(n-1)}$, where $\sum_{r > 0} \varpi(r) = 1$
The average network distance: $\bar{d} = \sum_{0 < r < \infty} r \, \varpi(r)$
The diameter of the network: $\hat{d} = \max \{ r : \varpi(r) > 0 \}$
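The distance measures above can be computed with a breadth-first search from every node. The sketch below assumes an unweighted network stored as a dictionary of neighborhoods; the example network and all function names are illustrative assumptions.

```python
import math
from collections import deque

def geodesic_distances(neighbors, source):
    """Breadth-first search; unreachable nodes keep distance +infinity."""
    dist = {v: math.inf for v in neighbors}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if dist[v] == math.inf:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_distribution(neighbors):
    """Fraction of ordered node pairs (i, j), i != j, at each distance r."""
    n = len(neighbors)
    counts = {}
    for i in neighbors:
        for j, d in geodesic_distances(neighbors, i).items():
            if j != i:
                counts[d] = counts.get(d, 0) + 1
    return {r: c / (n * (n - 1)) for r, c in counts.items()}

def average_distance(distribution):
    """Sum of r * fraction(r) over all finite distances r."""
    return sum(r * w for r, w in distribution.items() if r != math.inf)

def diameter(distribution):
    """Largest distance that occurs (infinite if the network is disconnected)."""
    return max(distribution)

# Hypothetical example: a triangle 1-2-3 with node 4 attached to node 3.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
w = distance_distribution(example)
```

In the example, 8 of the 12 ordered pairs are at distance 1 and 4 are at distance 2.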
Network Characteristics: Density
Network Characteristics: Closeness & Clustering
The total distance of node i: $\sum_{j \in N} d(i, j)$
The closeness is defined as: $c(i) \equiv \frac{1}{\sum_{j \in N} d(i, j)}$
For each node i having at least two neighbors, the clustering is
$C_i \equiv \frac{|\{ jk \in L : ij \in L \wedge ik \in L \}|}{z_i (z_i - 1) / 2}$
For each node j having fewer than two neighbors: $C_j = 0$
Clustering index of the network Γ: $C = \frac{1}{n} \sum_{i=1}^{n} C_i$
Presenter notes
Closeness is a useful measure in solving location problems: the minimum location problem, also called the median problem or service facility location problem
Informational implication: spread of information
Weak and strong ties [Granovetter, 1973]
– Strong ties: transitive
– Weak ties: transitivity is much less common
Strategic implications: network closure [Coleman, 1988]
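A minimal Python sketch of closeness and clustering, assuming a connected, undirected network stored as a dictionary of neighborhoods. The example network and the function names are illustrative assumptions.

```python
from collections import deque
from itertools import combinations

def distances_from(neighbors, source):
    """Breadth-first search: geodesic distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(neighbors, i):
    """c(i) = 1 / total distance; assumes a connected network with at least two nodes."""
    return 1.0 / sum(distances_from(neighbors, i).values())

def clustering(neighbors, i):
    """Fraction of pairs of neighbors of i that are themselves linked."""
    z = len(neighbors[i])
    if z < 2:
        return 0.0  # nodes with fewer than two neighbors get clustering 0
    linked = sum(1 for j, k in combinations(neighbors[i], 2) if k in neighbors[j])
    return linked / (z * (z - 1) / 2)

def clustering_index(neighbors):
    """Average of the node clustering values over all n nodes."""
    return sum(clustering(neighbors, i) for i in neighbors) / len(neighbors)

# Hypothetical example: a triangle 1-2-3 with node 4 attached to node 3.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
```

Node 3 is the closest to all others (total distance 3), but only one of its three neighbor pairs is linked, so its clustering is 1/3.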
Network Characteristics: Cohesiveness & Betweenness
Given a network Γ = (N, L), let M ⊂ N; for each node $i \in M$, the fraction of its connections that lie within M is
$H_i(M) = \frac{|\{ ij \in L : j \in M \}|}{z_i}$
The overall cohesiveness of the set M is defined as $H(M) = \min_{i \in M} H_i(M)$
If the network Γ is connected, let v(j, k) denote the number of shortest paths between j and k (j ≠ k) and $v_i(j, k)$ the number of those that pass through node i; the betweenness of node i is
$b_i \equiv \sum_{j \neq k} \frac{v_i(j, k)}{v(j, k)}$
Presenter notes
Intuitively, the centrality of a node measures the importance of this node in bridging the (indirect) connections between other nodes
Informational implications
– A central node is crucial for widespread network communication
– Communication is jeopardized by the malfunctioning of a central node
– A central node may become congested
Strategic implications
– The very importance of a central node/agent may be translated into an exploitation of this position to its advantage vis-à-vis the remaining nodes
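Cohesiveness is straightforward to compute once neighborhoods are available. The sketch below assumes every node in M has at least one link; the example network and the function names are illustrative assumptions.

```python
def node_cohesiveness(neighbors, i, M):
    """Fraction of the links of node i that lead to members of M."""
    return len(neighbors[i] & M) / len(neighbors[i])

def cohesiveness(neighbors, M):
    """Overall cohesiveness of M: the smallest node-level fraction within M."""
    return min(node_cohesiveness(neighbors, i, M) for i in M)

# Hypothetical example: a triangle 1-2-3 with node 4 attached to node 3.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
triangle = {1, 2, 3}
```

For the triangle, nodes 1 and 2 have all their links inside the set, while node 3 has one of its three links leaving it, so the set's cohesiveness is 2/3.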
Shortest-path Betweenness: Example
Nodes A and B will have high (shortest-path) betweenness in this configuration, while node C will not
$b_i \equiv \sum_{j \neq k} \frac{v_i(j, k)}{v(j, k)}$
A measure of the extent to which an actor has control over information flowing between others
In a network in which flow is entirely or at least mostly along geodesic paths, the betweenness of a node measures how much flow will pass through that particular node
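Shortest-path betweenness can be sketched by counting shortest paths with a breadth-first search from every node. Two assumptions are made here that the slide does not state: pairs with i as an endpoint are skipped, and the sum runs over ordered pairs (j, k), so every unordered pair counts twice in an undirected network. The path-shaped example and the function names are illustrative only.

```python
from collections import deque

def shortest_path_counts(neighbors, source):
    """Return (dist, sigma): geodesic distances and numbers of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:          # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(neighbors):
    """Sum over ordered pairs (j, k) of the share of shortest j-k paths through i."""
    info = {s: shortest_path_counts(neighbors, s) for s in neighbors}
    b = {i: 0.0 for i in neighbors}
    for j in neighbors:
        dist_j, sigma_j = info[j]
        for k in neighbors:
            if k == j or k not in dist_j:
                continue
            for i in neighbors:
                if i in (j, k) or i not in dist_j:
                    continue
                dist_i, sigma_i = info[i]
                # i lies on a shortest j-k path iff the distances add up
                if k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                    b[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return b

# Hypothetical example: the path 1 - 2 - 3 - 4.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
b = betweenness(path)
```

On the path, the two inner nodes carry all traffic between the two sides, while the end nodes lie between no other pair.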
Flow Betweenness
Flow betweenness of a node i is defined as the amount of flow through node i when the maximum flow is transmitted from s to t, averaged over all s and t:
$b^f_i \equiv \sum_{s, t \in N,\ s \neq i,\ t \neq i,\ f_{st} > 0} \frac{f^m_{st}(i)}{f_{st}}$
While calculating flow betweenness, vertices A and B will get high scores while vertex C will not
Presenter notes
Flow betweenness can be thought of as measuring the betweenness of nodes in a network in which a maximal amount of information is continuously pumped between all sources and targets
Maximum flow from a given s to all reachable targets t can be calculated in worst-case time O(l^2), and hence the flow betweenness for all nodes can be calculated in time O(l^2 n)
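A rough sketch of flow betweenness for an unweighted, undirected network follows. It rests on assumptions beyond the slide: every link has capacity 1, the maximum flow is found with the Edmonds-Karp method, and the flow through node i is taken to be the drop in maximum s-t flow when i is removed. The example network and function names are illustrative, and this brute-force version is far slower than the bound quoted in the notes.

```python
from collections import deque

def max_flow(neighbors, s, t, removed=None):
    """Edmonds-Karp maximum flow with unit capacity per link, optionally without one node."""
    nodes = [v for v in neighbors if v != removed]
    cap = {u: {v: 1 for v in neighbors[u] if v != removed} for u in nodes}
    flow = 0
    while True:
        # breadth-first search for an augmenting path in the residual network
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # push one unit of flow back along the path found
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] = cap[v].get(u, 0) + 1
            v = u
        flow += 1

def flow_betweenness(neighbors):
    """Sum over ordered pairs (s, t) of the share of maximum flow that depends on node i."""
    b = {i: 0.0 for i in neighbors}
    for s in neighbors:
        for t in neighbors:
            if s == t:
                continue
            f_st = max_flow(neighbors, s, t)
            if f_st == 0:
                continue
            for i in neighbors:
                if i in (s, t):
                    continue
                through_i = f_st - max_flow(neighbors, s, t, removed=i)
                b[i] += through_i / f_st
    return b

# Hypothetical example: the cycle 1 - 2 - 4 - 5 - 3 - 1.
cycle = {1: {2, 3}, 2: {1, 4}, 3: {1, 5}, 4: {2, 5}, 5: {3, 4}}
fb = flow_betweenness(cycle)
```

In the cycle, every pair is joined by two link-disjoint routes, and each remaining node sits on exactly one of them, so every node carries half of the flow of each of the 12 ordered pairs it is not part of.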
Case: AERCS - Recommendation of Venues for Young Computer Scientists
- 7,385,652 publications
- 22,735,240 citations
- Over 4 million author names
Combination
- Canopy clustering [McCallum 2000]
- Result: 864,097 matched pairs
- On average: venues cite 2306 and are cited 2037 times
Pham, Klamma, Jarke: Development of Computer Science Disciplines – A Social Network Analysis Approach, SNAM, 2011
Properties of Collaboration and Citation Graphs of Venues
Presenter notes
To gain insights into the above questions, we proceed as follows: for all the venues, we create their collaboration and citation subgraphs Ga and Gc. [..] By observing the histograms, we are able to understand the nature of computer science venues. The normalized histograms of the four metrics are given in Fig. 9.
The metrics reveal some important characteristics of computer science venues. Most venues are not narrowly focused on one topic but are interdisciplinary. This is shown by the large number of citation subgraphs with low density (Fig. 9a) and low clustering coefficient (Fig. 9b). However, venues also tend to develop a main theme, a core of closely related topics. This is shown by the large number of venues with a big largest connected component (Fig. 9c).
Now we compare the properties of the collaboration subgraphs and the citation subgraphs. In general, the network metrics are quite similar for both. For example, we observe the same trends for density (Fig. 9a), clustering coefficient (Fig. 9b) and maximum betweenness (Fig. 9d). The betweenness of collaboration subgraphs suggests the existence of gatekeepers, the key members of every venue, but only a few of them are important, as shown by the large number of venues with low betweenness in their collaboration subgraphs.
We also observe some differences in the largest connected components of collaboration and citation subgraphs (Fig. 9c). While a large number of citation subgraphs have a big largest connected component, we observe the opposite trend in the collaboration subgraphs. This suggests that few venues succeed in stimulating authors to collaborate closely on the main topics, even though they are working on the same topics.
User-based CF (collaborative filtering): Author Clustering
Data: DBLP
Perform 2 test cases for the years 2005 and 2006
- Clustering of co-authorship networks
- Prediction of the venue
Case: TeLLNet - SNA for European Teachers’ Lifelong Learning
How to manage and handle large-scale data on social networks?
How to analyse social network data in order to develop teachers’ competence, e.g. to facilitate better project collaboration?
How to make the network visualization useful for teachers’ lifelong learning?
Analysis and Visualization of Lifelong Learner Data
Performance Data on Projects
Network Structures and Patterns
Network Formation Strategies
Homophily – love of the same [LaMe54, MSK01]
– similar socio-economic status
– thinking in a similar way
Contagion
– being influenced by others
How to represent strategies for lifelong learners?
Presenter notes
Computer scientists marry computer scientists
Fireflies synchronizing their flashing (show video)
Game Theory Basics
Every situation as a game [Borel38, NeMo44]
A player makes decisions in a game
Players choose best strategies based on payoff functions
Payoffs: motivations of players
A strategy defines a set of moves or actions a player will follow in a given game (mixed strategy, pure strategy)
Presenter notes
The reference should be added to game theory
Mixed strategy: a probability distribution over strategies, which defines how often each move is played
Pure strategy: a strategy chosen without any randomization
Game Theory
A game is a tuple $G = \langle N, (A_i)_{i \in N}, (u_i)_{i \in N} \rangle$, where
N is a nonempty, finite set of players
Each player $i \in N$ has
1. a set of actions (strategy space) $A_i$
2. a payoff function $u_i : A \to \mathbb{R}$
3. a payoff matrix

                         Player B chooses white   Player B chooses black
Player A chooses white   1,1                      1,0
Player A chooses black   0,1                      0,0

Presenter notes
Games can be cooperative or non-cooperative; in both cases they concern the benefit of a particular player
We are interested not in a particular player but in a network of players
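The white/black payoff matrix can be checked for pure-strategy Nash equilibria by brute force. The following is a minimal sketch; the dictionary layout and the names `actions`, `payoffs` and `is_nash` are illustrative assumptions.

```python
from itertools import product

# Payoff matrix from the slide: (payoff of player A, payoff of player B)
actions = {"A": ["white", "black"], "B": ["white", "black"]}
payoffs = {
    ("white", "white"): (1, 1),
    ("white", "black"): (1, 0),
    ("black", "white"): (0, 1),
    ("black", "black"): (0, 0),
}

def is_nash(profile):
    """True if neither player can gain by deviating unilaterally from the profile."""
    a, b = profile
    a_is_best = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0] for alt in actions["A"])
    b_is_best = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1] for alt in actions["B"])
    return a_is_best and b_is_best

pure_nash = [p for p in product(actions["A"], actions["B"]) if is_nash(p)]
```

Choosing white pays 1 for each player regardless of the other's choice, so (white, white) is the only profile that survives the check.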
Network Formation Games
Social networks are formed by individual decisions
– Cost: write an e-mail
– Utility: cooperate with others
Social networks between pupils
– Cost: make a joke
– Utility: get appreciation from others
Lifelong learner networks
– Cost: take a learning course
– Utility: find learners with a similar way of reasoning
Presenter notes
Cost of forming links
Potential utility of linking
Nash Network: Win-Win Situation
Set of agents N, which are the actors of a network; i and j are typical members of this set
Every agent changes its strategy until all agents are satisfied with their strategies and none would benefit from changing its strategy (the network is stable): Nash equilibrium
A network is a Nash network if each agent is in Nash equilibrium
Chosen strategies defeat others for the good of all players [Nash51, FuTi91]
Presenter notes
Not only I and my partner benefit: all benefit!
Each strategy in a Nash equilibrium is a best response to all other strategies in that equilibrium
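The idea of agents revising their strategies until nobody wants to change can be sketched as best-response dynamics. Everything specific below is an assumption made for illustration, not the model from the talk: four agents, a strategy is the set of agents one pays to link to, links can be used in both directions, the benefit is the number of agents reached, and each link costs its sponsor 0.5.

```python
from itertools import combinations

AGENTS = [1, 2, 3, 4]
LINK_COST = 0.5  # hypothetical cost per sponsored link

def reachable(agent, strategies):
    """Agents connected to `agent`; a link is usable in both directions."""
    adj = {a: set() for a in strategies}
    for a, targets in strategies.items():
        for b in targets:
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = {agent}, [agent]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def payoff(agent, strategies):
    """Number of agents reached (including oneself) minus the cost of own links."""
    return len(reachable(agent, strategies)) - LINK_COST * len(strategies[agent])

def best_response(agent, strategies):
    """Best link set for `agent`; the current one is kept unless strictly beaten."""
    best, best_value = strategies[agent], payoff(agent, strategies)
    others = [a for a in AGENTS if a != agent]
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            trial = dict(strategies)
            trial[agent] = frozenset(subset)
            value = payoff(agent, trial)
            if value > best_value + 1e-12:
                best, best_value = frozenset(subset), value
    return best

def best_response_dynamics(max_rounds=100):
    """Agents revise in turn, starting from the empty network, until nobody changes."""
    strategies = {a: frozenset() for a in AGENTS}
    for _ in range(max_rounds):
        changed = False
        for agent in AGENTS:
            new = best_response(agent, strategies)
            if new != strategies[agent]:
                strategies[agent] = new
                changed = True
        if not changed:
            return strategies
    return None  # no stable network found within max_rounds

def is_nash(strategies):
    """Every agent's strategy is already a best response to the others."""
    return all(best_response(a, strategies) == strategies[a] for a in AGENTS)

stable = best_response_dynamics()
```

Under these assumptions the process stops at a star in which agent 1 sponsors all links: the center loses more than it saves by dropping a link, and the others already reach everyone for free.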
Epistemic Frame for TeLLNet
Identity
• the way members of a community see themselves in the community
• institution role, country
Skills
• tasks community members perform
• languages, subjects, and tools from projects
Knowledge
• the understanding shared by members of a community
• languages, subjects
Values
• beliefs of members
• experiences from projects (partners)
Epistemology
• warrants that justify members’ actions as legitimate
• quality labels, prizes, European quality labels
Presenter notes
It is clear that the identity of each agent is teacher, but that is too little for us
Values are based on experiences and are the only element that we cannot take directly as an input into the simulation model
– Community: society
– Identity: I originally come from Ukraine
– Skills: as I work in the TEL area, I have some ideas about pedagogy: cognitivism, behaviourism, constructivism
– Knowledge: computer science background
– Values:
– Epistemology: I work at RWTH Aachen as a researcher and am doing my PhD
Multi-Agent Simulation System
A multi-agent system is a collection of heterogeneous and diverse intelligent agents that interact with each other and their environment [SiAi08]
– Recommendations: Yenta [Foner97], looking for users with similar interests based on data from Web media
– Market-binding mechanisms: looking for the best item (a reward agent, a set of items and user agents) [WMJe05]
– Team formation: forming teams for performing a task in a dynamic environment [GaJa05]
Presenter notes
There are two types: multi-agent decision systems and multi-agent simulation systems
In the first, all agents make a joint decision
Unlike analytical models, a simulation model is not solved but is run, and the changes of system states can be observed at any point in time.
Multi-Agent Simulation Questions
Which kind of behavior can be expected under arbitrarily given parameter combinations and initial conditions?
Which kind of behavior will a given target system display in the future?
Which state will the target system reach in the future?
[Troitzsch2000]
[Figure panels: 2008, 2009, 2010]
Presenter notes
If this theory has been adequately translated into a computer model, this would allow you to answer some of the following questions
Initial conditions: teachers looking for partners to create projects
Given are teachers with a set of parameters
What happens next? (on flip chart)
What is the result after a day/a month/a year? What are the behaviours of teachers and the state of the system (the network)?
We focus more on the last two questions, which deal with predictions and trends
We use the agent-based simulation technique
Agent Based Simulation
Heterogeneous, autonomous and pro-active actors, such as human-centered systems
– Agents are capable of acting without human intervention
– Agents possess goal-directed behavior
– Each agent has its own incentives and motives
Suited for modeling organizations: most work is based on cooperation and communication
[Gazendam, 1993]
Presenter notes
Do ABS models allow one to take both of the last two questions into account? Kind of behaviour: yes; what about the state?
Each agent’s behaviour is defined by its own set of attribute values, which allows modelling variation in each individual’s behaviour, and the simulation design is decentralised, which allows the agents to be pro-active.
Show the video with the cleaning robot:
Inputs for simulation model
Agent = Teacher
Teacher properties:
– Languages
– Subjects
– Country
– Institution role
– Any awards? (European Quality Label or Prize)
Project properties:
– Languages
– Tools
– Subjects
– Number of pupils in a project
– Age of pupils in a project
– Any award? (Quality Label)
Network Formation Game Simulation
Payoff definition: the payoff matrix is calculated dynamically based on the Epistemic Frame vector:
– teachers’ subjects, subjects of projects (experiences)
– teachers’ languages, languages of projects (experiences)
– tools used in projects (experiences)
– countries past collaborators come from (beliefs)
– ...
Strategy definition: homophily or contagion
Looking for a suitable network for a teacher, not for a suitable partner!
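One way to picture a dynamically calculated, homophily-based payoff is to score how much two teachers' frame attributes overlap. The sketch below is purely illustrative: the `Teacher` fields, the use of Jaccard similarity, the equal weighting of features and the two sample teachers are all assumptions, not the payoff function used in TeLLNet.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Teacher:
    """Hypothetical agent record built from teacher and project properties."""
    name: str
    languages: frozenset
    subjects: frozenset
    tools: frozenset
    partner_countries: frozenset

FEATURES = ("languages", "subjects", "tools", "partner_countries")

def jaccard(a, b):
    """Overlap of two sets: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def homophily_payoff(x, y):
    """Average attribute overlap; higher values mean more similar teachers."""
    return sum(jaccard(getattr(x, f), getattr(y, f)) for f in FEATURES) / len(FEATURES)

def payoff_matrix(teachers):
    """Payoff for every ordered pair of distinct teachers, recomputed on demand."""
    return {
        (x.name, y.name): homophily_payoff(x, y)
        for x in teachers
        for y in teachers
        if x is not y
    }

# Made-up sample data
anna = Teacher("anna", frozenset({"en", "de"}), frozenset({"math"}),
               frozenset({"wiki"}), frozenset({"PL"}))
ben = Teacher("ben", frozenset({"en"}), frozenset({"math"}),
              frozenset({"blog"}), frozenset({"PL"}))
matrix = payoff_matrix([anna, ben])
```

Because the matrix is derived from the current attribute sets, it changes whenever a teacher gains new project experience, which is the sense in which such a payoff would be dynamic.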
Nash Equilibrium for Network Formation
Finding a Nash Equilibrium (NE) is NP-hard
Computer scientists deal with finding appropriate techniques for calculating NE with a lot of agents
We are not interested in the best solution but in a better solution
Conclusions & Outlook
Network Science is an interdisciplinary approach between computer science and other disciplines
Mediabase framework based on modeling & reflection support
Two case studies
– Network Flow: analysis and visualization of large digital libraries; identification of basic flow parameters
– Network Formation: analysis and visualization of large learner networks; performance indicators and visual analytics
Application of tools on entrepreneurial problems: Causation and Effectuation (Excellence Project OBIP at RWTH Aachen University)
Researching network dynamics by time series analysis and multi-agent simulation