
Analytics Infrastructures for Scientific Communities

Jul 02, 2015

Michael Derntl

Keynote presentation at XVI International Symposium on Computers in Education (SIIE 2014), November 12, 2014, Logroño, Spain
Transcript
Page 1: Analytics Infrastructures for Scientific Communities

Lehrstuhl Informatik 5

(Information Systems)

Prof. Dr. M. Jarke

These slides are licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Analytics Infrastructures for Scientific Communities

Michael Derntl

RWTH Aachen University

Advanced Community Information Systems (ACIS)

[email protected]

XVI International Symposium on Computers in Education (SIIE 2014)

November 12, 2014

Logroño, Spain

Parts of the work reported in this presentation have been funded with support from the European Commission. This presentation reflects the views only of the presenter, and the Commission cannot be held responsible for any use which may be made of the information contained therein.

Page 2: Analytics Infrastructures for Scientific Communities

Scientific Communities

Scientific results socially created in scientific communities

Quality of products success of community

Stakeholder interest in success factors

(Kornfeld & Hewitt 1981)

Page 3: Analytics Infrastructures for Scientific Communities

Classic Success Indicators

Scholarly Publications, Citations, Impact Factors, Rankings, etc.

→ Established communities

Page 4: Analytics Infrastructures for Scientific Communities

New Publication Channels

© 2012 Intel – Source: http://www.intel.com/content/www/us/en/communications/internet-minute-infographic.html

Web 2.0, social media/networks, etc.

Scattered information and large data volumes – big data

Page 5: Analytics Infrastructures for Scientific Communities

Big Data – The 4 V’s

© 2013 IBM – Source: http://www.ibmbigdatahub.com/infographic/four-vs-big-data

Page 6: Analytics Infrastructures for Scientific Communities

Community IS

Community information systems (CIS) provide environments / infrastructures that support community needs, structures and processes

(Some) Challenges

– Short, disruptive innovation cycles

– Scaling of designs and technology

– Long tail issues: Segmentation, diversification, heterogeneity…

– Vendors seeking lock-in situations

– Sustainability, business models

– Privacy and security

– Increasing collaboration needs

– …

Page 7: Analytics Infrastructures for Scientific Communities

Advanced Community Information Systems (ACIS)

Diagram labels:

– Responsive Open Community Information Systems

– Community Visualization and Simulation

– Community Analytics

– Community Support

– Web Analytics

– Web Engineering

– Requirements Engineering

Page 8: Analytics Infrastructures for Scientific Communities

CIS Aspects in this Talk

CIS Infrastructure

Analytics

Plug & Play

Openness

Scaling

Real-Time

Page 9: Analytics Infrastructures for Scientific Communities

Community Analytics

Discovery and communication of insights in data

Weapons

– Social network analysis, Text mining, Topic modeling, Pattern mining, Deep learning, Visual analytics, …

Infrastructure & CIS

– Analytics as a service

– Visual interaction

– Open APIs

Source: quebit.com

Page 10: Analytics Infrastructures for Scientific Communities

Example: TEL-Map

Future gazing and roadmapping

FP7 Support Action 2011-2013

Methods

– Weak signal analysis

– Trend analysis, Forecasting

– Information visualization

Data sources

– Publications

– Blogosphere

– R&D Projects

http://telmap.org

Page 13: Analytics Infrastructures for Scientific Communities

“We invested hundreds of millions of Euros! Tell us what happened with that money!”

“How about some roadmaps, too? And we need a web portal!”

Page 14: Analytics Infrastructures for Scientific Communities

Collaborative Projects – Network Analysis

Collaborative projects are key in the R&D value chain

Stakeholders have an interest in the collaboration structures & dynamics

Existing work on collaboration networks in FP1-6

– Complex scale-free networks; small diameter, high clustering

– “Oligarchic core” of organizations

(Barber et al 2006; Roediger-Schluga & Barber 2008; Frachisse et al 2008; Breschi & Cusmano 2004; Lozano et al 2007; Scherngell & Barber 2009; Roediger-Schluga & Dachs 2006; Voigt et al 2011; Derntl & Klamma 2012)

Page 15: Analytics Infrastructures for Scientific Communities

TEL Projects Data Set (Status: Oct 2014)

147 projects, 1020 organizations

– eTEN (39) – eLearning

– FP6 (32) – TEL

– FP7 (52) – TEL

– eContentplus (19) – Educ.

– PSP (5) – TEL

Chart: EC funding (million €) per year, 1999 to 2014

Page 16: Analytics Infrastructures for Scientific Communities

Projects as Social Networks

Projects × Organizations

Project consortium progression

– Nodes: Projects

– Edges: Overlap of consortia (directed, weighted)

Organizational collaboration

– Nodes: Organizations

– Edges: Collaboration in multiple projects (undirected, weighted)

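Example edge in the project network: ROLE and TEL-Map, overlapping partners IMC, RWTH, OU, ZSI

Example edge in the organization network: The Open University and KU Leuven, common projects STELLAR, EUROGENE, ROLE, PROLEARN, iCOPER, ASPECT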

(Frachisse et al 2008)

Page 17: Analytics Infrastructures for Scientific Communities

Consortium Progression

Overlap ≥ 2

Time diff ≥ 3 months

Nodes 106

Edges 373

Node size proportional to weighted degree

Node color represents cluster (Blondel et al 2008)
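
The construction of the consortium progression network can be sketched as follows. This is a minimal illustration and not the code used for the talk: the input layout (project name mapped to a start date and a set of organizations), the function names, and the reading of the edge direction as "earlier project to later project" are assumptions. The thresholds default to the values on the slide (overlap ≥ 2, time difference ≥ 3 months).

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (negative if reversed)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def consortium_progression(projects, min_overlap=2, min_months=3):
    """Build directed, weighted edges between projects.

    projects: dict mapping project name -> (start date, set of organizations).
    An edge (p, q) is created when q starts at least `min_months` after p and
    both consortia share at least `min_overlap` organizations. The edge weight
    is the size of the overlap.
    """
    edges = {}
    for p, (start_p, orgs_p) in projects.items():
        for q, (start_q, orgs_q) in projects.items():
            if p == q:
                continue
            overlap = len(orgs_p & orgs_q)
            if overlap >= min_overlap and months_between(start_p, start_q) >= min_months:
                edges[(p, q)] = overlap
    return edges


def weighted_degree(edges):
    """Sum of incident edge weights per node (used for node size on the slide)."""
    degree = {}
    for (p, q), weight in edges.items():
        degree[p] = degree.get(p, 0) + weight
        degree[q] = degree.get(q, 0) + weight
    return degree
```

The cluster coloring on the slide uses the community detection method of Blondel et al 2008, which is not reproduced in this sketch.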

Page 18: Analytics Infrastructures for Scientific Communities

Project Impact on the Landscape

Measure impact of project consortium members on sustaining and shaping the social project ties after the project start, relative to opportunity.

Impact:

$\delta_p = \frac{|S_p^{t,k}|}{|D_p^t|} \cdot \sum_{q \in S_p^{t,k}} \frac{|C_p \cap C_q|}{|C_p|}$

– First factor: successor projects relative to opportunity

– Sum: cumulative fraction of successor projects filled up with p's members

where

$S_p^{t,k}$: projects starting t time units after p and having at least k partners overlap with p

$D_p^t$: all potential successor projects of p after t time units

$C_p$: consortium members of p

(Derntl & Klamma 2012)
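
A small sketch of how the impact measure could be computed. It rests on several assumptions that are not stated on the slide: time units are calendar months, "t time units after p" is read as "at least t months after p", and the per-successor share is normalized by the size of p's consortium. The data layout and function names are hypothetical.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def impact(p, projects, t=3, k=2):
    """Impact of project p under the assumptions named above.

    projects: dict mapping project name -> (start date, set of members).
    """
    start_p, members_p = projects[p]

    # D: all potential successor projects, i.e. those starting >= t months after p
    potential = [
        q for q, (start_q, _) in projects.items()
        if q != p and months_between(start_p, start_q) >= t
    ]
    if not potential:
        return 0.0  # no opportunity to have successors

    # S: potential successors sharing at least k members with p
    successors = [q for q in potential if len(members_p & projects[q][1]) >= k]

    # Sum of overlap shares over all successors
    share = sum(len(members_p & projects[q][1]) / len(members_p) for q in successors)

    return len(successors) / len(potential) * share
```

With four members in p, two later projects, and one of them sharing two members with p, the sketch yields (1/2) · (2/4) = 0.25.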

Page 19: Analytics Infrastructures for Scientific Communities

Top Projects by Impact

Mainly large projects (networks like NoEs, BPNs, Pilots; and IPs)

All programs (FP6, FP7, eTEN, PSP, eC+)

Page 20: Analytics Infrastructures for Scientific Communities

Project Watchlist

Impact correlates positively with

– Funding, Consortium size, Betweenness centrality, (Weighted) in-degree (by size)

Future gazing

Project (period; program)            Funding m€   wdin (C) ▼   wdin/C
OpenDiscoverySpace (2012-15; PSP)    7.7          74 (51)      1.45
Inspiring Science (2013-16; PSP)     4.9          56 (29)      1.93
GALA (2010-14; FP7)                  5.7          55 (31)      1.77
WESPOT (2012-15; FP7)                2.9          40 (9)       4.44
GO-LAB (2012-16; FP7)                9.7          38 (19)      2.00
LACE (2014-16; FP7)                  1.3          27 (9)       3.00
ITEC (2010-14; FP7)                  9.5          22 (27)      0.81

Filter: (Running OR Ended in 2014) AND wdin > 20

(Derntl & Klamma 2012)
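
The watchlist filter and the last table column can be reproduced with a few lines. The sketch below uses the figures from the table above. It assumes that C is the consortium size and that "Running OR Ended in 2014" can be approximated by "end year ≥ 2014". The tuple layout and function name are illustrative only.

```python
# (project, end year, funding in million EUR, weighted in-degree, consortium size)
WATCHLIST_DATA = [
    ("OpenDiscoverySpace", 2015, 7.7, 74, 51),
    ("Inspiring Science", 2016, 4.9, 56, 29),
    ("GALA", 2014, 5.7, 55, 31),
    ("WESPOT", 2015, 2.9, 40, 9),
    ("GO-LAB", 2016, 9.7, 38, 19),
    ("LACE", 2016, 1.3, 27, 9),
    ("ITEC", 2014, 9.5, 22, 27),
]


def watchlist(rows, min_wdin=20, status_year=2014):
    """Keep running or just-ended projects with weighted in-degree > min_wdin.

    Returns (project, wdin, wdin per consortium member), sorted by wdin descending.
    """
    kept = [row for row in rows if row[1] >= status_year and row[3] > min_wdin]
    kept.sort(key=lambda row: row[3], reverse=True)
    return [(name, wdin, round(wdin / size, 2)) for name, _, _, wdin, size in kept]
```

Normalizing by consortium size changes the picture: WESPOT and LACE have small consortia (9 members each) but the highest in-degree per member.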

Page 21: Analytics Infrastructures for Scientific Communities

Organizational Collaboration

Collaboration is the fertile soil for R&D output in CPs

Follow-up proposals / projects

Shapes the research agenda

Graph:

– Edge between O1 and O2 if both participated in at least k projects

– Weight: number of projects

– Direction: none

– Nodes: organizations

Example edge: The Open University and KU Leuven, common projects STELLAR, EUROGENE, ROLE, PROLEARN, iCOPER, …
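
The graph definition above maps directly onto a co-occurrence count. The following is a minimal sketch; the input layout (project name mapped to a set of organizations) and the function name are assumptions made for illustration.

```python
from itertools import combinations


def collaboration_graph(consortia, k=2):
    """Undirected, weighted organization graph.

    consortia: dict mapping project name -> set of participating organizations.
    Returns a dict mapping a sorted pair (o1, o2) to the number of projects both
    took part in, keeping only pairs with at least k common projects.
    """
    counts = {}
    for organizations in consortia.values():
        # Sorting gives every undirected pair a single canonical key
        for pair in combinations(sorted(organizations), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: n for pair, n in counts.items() if n >= k}
```

Raising k thins the graph out quickly, because most pairs of organizations meet in only one project.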

Page 22: Analytics Infrastructures for Scientific Communities

Organizational Collaboration Network

Edge only if at least two common projects

Page 23: Analytics Infrastructures for Scientific Communities

Organizational Collaboration – “Oligarchic Core”

Filter: degree ≥ 20

Page 24: Analytics Infrastructures for Scientific Communities

Fine, but…

“How about those NSF projects?”

“How about those ARC projects down under?”

Page 25: Analytics Infrastructures for Scientific Communities

Projects Space at Learning Frontiers Portal

http://learningfrontiers.eu/?q=project_space

Page 26: Analytics Infrastructures for Scientific Communities

Projects Space – Project Details

http://learningfrontiers.eu/?q=tel_project/TEL-MAP

Page 27: Analytics Infrastructures for Scientific Communities

Portal Architecture

Diagram components:

– Learning Frontiers Portal

– Project Space Module

– Dashboard

– Project Aspects (Confolio)

– Query Service

– Projects DB, Discourse DB, … DB

Page 28: Analytics Infrastructures for Scientific Communities

Dashboard

http://learningfrontiers.eu/?q=dashboard

(Derntl, Erdtmann, Klamma 2012)

Page 29: Analytics Infrastructures for Scientific Communities

Embedding in Host Application

Layers: Visualization Layer, Application Layer, Data Layer

Diagram components: Dashboard Container, Dashboard, Visualization Widget, Application Server, User Management, Dashboard Service, Query Service, Application Data, User Data, Widget Data, Data Sources, Database(s)

Steps (arrows in the diagram are labeled 1a, 1b, 2, 3, 4a, 4b, 5a, 5b, 5c, 6a, 6b):

1 Register user (on first visit; automatically done by the embedding application)

2 Hand over user credentials to the dashboard container

3 Dashboard container logs in the user on the Application Server

4 Retrieve list of available visualization widgets

5 Display visualization in widget

6 Store user preferences

Legend: each component is either a component of the embedding application or a component of the dashboard framework

(Derntl, Erdtmann, Klamma 2012)
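
The six steps can be walked through with a small in-memory sketch. Everything in it is hypothetical: the class names, method names, and session tokens are invented for illustration and do not reflect the API of the dashboard framework described in the cited paper.

```python
class ApplicationServer:
    """Hypothetical stand-in for the application layer of the dashboard framework."""

    def __init__(self, widgets):
        self.widgets = widgets      # widget name -> function returning widget data
        self.users = {}             # user name -> secret
        self.sessions = {}          # session token -> user name
        self.preferences = {}       # user name -> stored preferences

    def register_user(self, user, secret):
        self.users.setdefault(user, secret)

    def log_in(self, user, secret):
        if self.users.get(user) != secret:
            raise PermissionError("unknown user or wrong secret")
        token = f"session-{len(self.sessions) + 1}"
        self.sessions[token] = user
        return token

    def list_widgets(self, token):
        self.sessions[token]        # raises KeyError for an invalid session
        return sorted(self.widgets)

    def widget_data(self, token, name):
        self.sessions[token]
        return self.widgets[name]()

    def store_preferences(self, token, preferences):
        self.preferences[self.sessions[token]] = dict(preferences)


class DashboardContainer:
    """Hypothetical container embedded in the host application's page."""

    def __init__(self, server):
        self.server = server
        self.token = None

    def receive_credentials(self, user, secret):
        self.token = self.server.log_in(user, secret)          # step 3

    def show(self, name):
        return f"[{name}] {self.server.widget_data(self.token, name)}"


def embed_dashboard(server, user, secret):
    server.register_user(user, secret)                        # step 1
    container = DashboardContainer(server)
    container.receive_credentials(user, secret)               # step 2 (and 3)
    available = server.list_widgets(container.token)          # step 4
    rendered = container.show(available[0])                   # step 5
    server.store_preferences(container.token, {"widgets": [available[0]]})  # step 6
    return available, rendered
```

The point of the design is that the host application only registers the user and hands over credentials. All later traffic runs between the dashboard container and the application server.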

Page 31: Analytics Infrastructures for Scientific Communities

Text Analysis

Extracting meaning from a (text) data corpus with machine learning techniques

Building statistical models of text documents

– e.g. probabilistic Latent Semantic Analysis (pLSA), Latent Dirichlet Allocation (LDA), n-grams, etc.

Goal: Present results so community can explore

Page 32: Analytics Infrastructures for Scientific Communities

Topic Modeling (LDA)

Image Source: Blei 2012

Generative model: documents are generated by picking words from topics
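
The generative story can be made concrete with a short sketch. It is a simplification: the topics are given as fixed word distributions instead of being drawn from a Dirichlet prior, and the function names and sample vocabulary are made up for illustration. Only the Python standard library is used.

```python
import random


def dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Generate one document following the LDA generative process.

    topics: list of dicts, each mapping word -> probability within that topic.
    alpha:  Dirichlet parameters, one per topic.
    Returns the document's topic proportions and a list of (topic index, word).
    """
    theta = dirichlet(alpha, rng)                 # per-document topic proportions
    words = []
    for _ in range(length):
        # Pick a topic for this word position, then pick a word from that topic
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocabulary = list(topics[z])
        weights = [topics[z][w] for w in vocabulary]
        words.append((z, rng.choices(vocabulary, weights=weights)[0]))
    return theta, words


example_topics = [
    {"network": 0.5, "graph": 0.3, "edge": 0.2},
    {"topic": 0.6, "word": 0.4},
]
theta, document = generate_document(example_topics, [0.5, 0.5], 10, random.Random(0))
```

Fitting an LDA model runs this story in reverse: given only the words, it infers topic proportions and topic word distributions that plausibly generated them.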

Page 33: Analytics Infrastructures for Scientific Communities

Topic Model Visualizations

Goal: support users in interpretation and reasoning based on LDA results

Adopt paradigm of Visual Analytics:

– “Turning information overload into an opportunity”

– Integrate human judgment in the analysis process by means of visual representations and interaction techniques

(Keim et al 2008)

Page 34: Analytics Infrastructures for Scientific Communities

Source: http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Spain.html

Example Visualization: Wikipedia Topics

Page 35: Analytics Infrastructures for Scientific Communities

Topic Dynamics

Assumption: topics and words evolve over time

Various ways of visualization and user interaction

Page 36: Analytics Infrastructures for Scientific Communities

D-VITA: Dynamic Visual Topic Analytics

http://is.gd/DVITA

(Derntl et al 2014c; Günnemann et al 2013)

Page 37: Analytics Infrastructures for Scientific Communities

Fine, but…

“How have the themes at ER conference evolved in the last 30 years?”

Page 39: Analytics Infrastructures for Scientific Communities

Topic Model Builder

Backend components for configurable toolchains from raw data to built topic model

Page 41: Analytics Infrastructures for Scientific Communities

Learning Layers in the Construction Industry

Interacting with People at the workplace

Paul discovers a problem at the construction site with PLC equipment ...

Q: How to use PLC equipment …?

• I have seen this before here …

• Last time I did it, I …

• Here is something helpful

Generating dynamic Learning Material

The regional training center observes the Q&A and links it to their course material ...

Tutorial: How to Use PLC

• What is PLC

• How to use it?

• Examples

• Further Information

• Hot Questions and Answers

Social Semantic Layer

Emerging shared meaning, giving context

Maturing

Concepts: Energy Consumption, Lightning, PLC Equipment, X3-PVQ, X3-PJC, X3-POZ

Instructional Taxonomy

• What is …

• How to …

• Example of …

Work Practice Taxonomy

• Installation

• Testing

• Operation

People: Peter, Paul, Mary

Interacting in the Physical Workplace

Physical workplace is equipped with QR tags, learning materials are delivered just in time ...

A list of helpful resources

• Tutorials: How to use …

• Persons: Peter, Mary, …

• Work Practice: Installation, ...

• Concepts: PLC, Lightning

• Q&A: …

Page 44: Analytics Infrastructures for Scientific Communities

Dropping the Box

LAPPS

Layers App Store

(Derntl et al 2014b)

Page 45: Analytics Infrastructures for Scientific Communities

METIS: Scaling Co-Design in Teacher Communities

Goal: Platform to integrate and support the complete learning design life cycle

Page 46: Analytics Infrastructures for Scientific Communities

Integrated Learning Design Environment

Community-based instances; Web GUI

RESTful APIs for integrating external tools

Sharing and reuse tracing features

Used in different educational contexts

Currently used in Hands-On ICT MOOC “Design Studio for ICT based learning activities”

http://ilde.upf.edu

(Hernández-Leo et al. 2013, 2014)

Page 48: Analytics Infrastructures for Scientific Communities

Summary

Challenges

– Perpetual change, Tight competition, Big data, Versatile computing methods

– Need to support community processes, assessing, reflecting, forecasting, roadmapping community support infrastructures

Key infrastructure-level aspects of solutions

– State-of-the-art analysis methods, visual analytics

– Embeddable components, scalable architecture

– Openness of processes, software, data

Page 49: Analytics Infrastructures for Scientific Communities

Thank you!

CIS Infrastructure

Analytics

Plug & Play

Openness

Scaling

Real-Time

Page 50: Analytics Infrastructures for Scientific Communities

References

Barber, M., Krueger, A., Krueger, T., Roediger-Schluga, T.: Network of European Union–funded collaborative research and development projects. Physical Review E 73 (2006)

Blei, D.M., Lafferty, J.D.: Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, ed. W. Cohen, A. Moore, pp. 113–120. New York: Assoc. Comput. Mach. (2006)

Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3:993–1022 (2003)

Blei, D.M.: Probabilistic topic models. Commun. ACM 55(4): 77–84 (2012)

Blondel, V. D., Guillaume, J., Lambiotte, R., Lefevre, E.: Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008 (10)

Breschi, S., Cusmano, L.: Unveiling the texture of a European Research Area: emergence of oligarchic networks under EU Framework Programmes. International Journal of Technology Management 27(8), 747–772 (2004)

Derntl, M., Klamma, R. (eds.), Hannemann, A., Koren, I., Nicolaescu, P., Renzel, D., Kravcik, M., Shahriari, M., Purma, J., Bachl, M., Bellamy, E., Elferink, R., Tomberg, V., Theiler, D., Santos, P.: Customizable Architecture for Flexible Small-Scale Deployment. Learning Layers Deliverable D6.2, October 2014 (2014b)

Derntl, M., Koren, I., Nicolaescu, P., Renzel, D., Klamma, R.: Blueprint for Software Engineering in Technology Enhanced Learning Projects. In C. Rensing, S. de Freitas, T. Ley, P. J. Muñoz Merino (Eds.), Open Learning and Teaching in Educational Communities – 9th European Conference on Technology Enhanced Learning, EC-TEL 2014, Graz, Austria, September 16-19, 2014, Proceedings. Lecture Notes in Computer Science, vol. 8719, pp. 404–409 (2014a)

Derntl, M., Klamma, R.: The European TEL Projects Community from a Social Network Analysis Perspective. EC-TEL 2012: 51–64

Derntl, M., Erdtmann, S., Klamma, R.: An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Communities. Proceedings of 12th International Conference on Knowledge Management and Knowledge Technologies, I-KNOW '12, Article No 23. ACM Press (2012)

Derntl, M., Günnemann, N., Tillmann, A., Klamma, R., Jarke, M.: Building and Exploring Dynamic Topic Models on the Web. In Proceedings of ACM CIKM 2014, Shanghai, China. ACM Press (2014c)

Frachisse, D., Billand, P., Massard, N.: The Sixth Framework Program as an Affiliation Network: Representation and Analysis (2008), http://ssrn.com/abstract=1117966

Keim, D. A., Mansmann, F., Schneidewind, J., Thomas, J., Ziegler, H.: Visual Analytics: Scope and Challenges, pages 76–90. Springer LNCS, 2008

Kornfeld, W. A., Hewitt, C.: The Scientific Community Metaphor. IEEE Trans. Syst., Man, and Cybern., SMC-11(1):24–33, 1981

Lozano, S., Duch, J., Arenas, A.: Analysis of large social datasets by community detection. The European Physical Journal Special Topics 143(1), 257–259 (2007)

Roediger-Schluga, T., Barber, M.J.: R&D collaboration networks in the European Framework Programmes: data processing, network construction and selected results. International Journal of Foresight and Innovation Policy 4(3/4), 321–347 (2008)

Roediger-Schluga, T., Dachs, B.: Does technology affect network structure? – A quantitative analysis of collaborative research projects in two specific EU programmes. UNU-MERIT Working Paper Series 041 (2006)

Scherngell, T., Barber, M.J.: Spatial interaction modelling of cross-region R&D collaborations: empirical evidence from the 5th EU framework programme. Papers in Regional Science 88(3), 531–546 (2009)

Thomas, J., Cook, K.A. (eds.): Illuminating the Path: The Research and Development Agenda for Visual Analytics. IEEE, 2005

Voigt, C. (ed.): Deliverable D7.5, STELLAR Network of Excellence (2011)