Large-scale analysis of bibliometric networks Nees Jan van Eck Centre for Science and Technology Studies (CWTS), Leiden University International Conference on Data-driven Discovery: When Data Science Meets Information Science Beijing, China, June 20, 2016

Jan 18, 2017

Page 1: Large-scale analysis of bibliometric networks

Large-scale analysis of bibliometric networks

Nees Jan van Eck
Centre for Science and Technology Studies (CWTS), Leiden University

International Conference on Data-driven Discovery: When Data Science Meets Information Science
Beijing, China, June 20, 2016

Page 2: Large-scale analysis of bibliometric networks

Bibliographic databases: ‘Big data’

               Web of Science   Scopus
Journals       12,000           20,000
Publications   45 million       35 million
Citations      1 billion        0.9 billion

Page 3: Large-scale analysis of bibliometric networks

Bibliometric networks

Bibliographic database (Web of Science, Scopus), from which the following networks can be constructed:

• Citation network of pubs / authors / journals
• Co-authorship network of authors / organizations
• Co-citation network of pubs / authors / journals
• Co-occurrence network of keywords / terms
• Bibliographic coupling network of pubs / authors / journals
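As a rough illustration of how two of these networks relate to the underlying citation data, the sketch below derives bibliographic coupling and co-citation counts from reference lists. The publication identifiers and the `references` dictionary are hypothetical toy data, not taken from Web of Science or Scopus, and the function names are made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing publication mapped to the publications it cites.
references = {
    "P1": ["A", "B"],
    "P2": ["A", "B", "C"],
    "P3": ["C"],
}

def bibliographic_coupling(references):
    """Count the shared references for every pair of citing publications."""
    coupling = Counter()
    for p, q in combinations(sorted(references), 2):
        shared = len(set(references[p]) & set(references[q]))
        if shared:
            coupling[(p, q)] = shared
    return coupling

def co_citation(references):
    """Count how often every pair of publications is cited together."""
    cocited = Counter()
    for cited in references.values():
        for a, b in combinations(sorted(set(cited)), 2):
            cocited[(a, b)] += 1
    return cocited

coupling = bibliographic_coupling(references)
cocited = co_citation(references)
```

In both cases the resulting counts can serve as edge weights: coupling links the citing publications, co-citation links the cited ones.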

Page 4: Large-scale analysis of bibliometric networks

Outline

• Software tools

• Network analysis techniques

• Analysis of data science


Page 5: Large-scale analysis of bibliometric networks

Software tools


Page 6: Large-scale analysis of bibliometric networks

Software tools

• VOSviewer (www.vosviewer.com)

– Tool for constructing and visualizing bibliometric networks

• CitNetExplorer (www.citnetexplorer.nl)

– Tool for visualizing and analyzing citation networks of publications

• Both tools have been developed together with my colleague Ludo Waltman

Page 7: Large-scale analysis of bibliometric networks

VOSviewer


Page 8: Large-scale analysis of bibliometric networks

VOSviewer: Overview

• Software tool for visualizing (bibliometric) networks

• Built-in support for popular bibliographic databases

• Text mining functionality

• Layout and clustering techniques

• Advanced visualization features:

– Smart labeling algorithm

– Overlay visualizations

– Density visualizations (‘heat map’)

• Users:

– Researchers

– Professional users (e.g., universities, libraries, funders, publishers)

Page 9: Large-scale analysis of bibliometric networks

Map of university co-authorship network

Page 10: Large-scale analysis of bibliometric networks

Map of journal citation network


Page 11: Large-scale analysis of bibliometric networks

CitNetExplorer


Page 12: Large-scale analysis of bibliometric networks

VOSviewer:
• Any type of bibliometric network
• Co-authorship, direct citations, co-citation, and bibliographic coupling
• Time dimension is ignored
• Networks of at most ~10,000 nodes are supported

CitNetExplorer:
• Only citation networks of publications
• Direct citation between publications
• Time dimension is explicitly considered
• Millions of publications are supported

Page 13: Large-scale analysis of bibliometric networks

Network analysis techniques

Page 14: Large-scale analysis of bibliometric networks

Network analysis techniques

Layout:
• Assigning the nodes in a network to locations in a (usually 2D) space (a.k.a. mapping)
• Visualization of similarities (VOS)

Clustering:
• Partitioning the nodes in a network into a number of groups (a.k.a. community detection)
• Weighted modularity
• Smart local moving algorithm

Page 15: Large-scale analysis of bibliometric networks

Clustering can be seen as mapping in a restricted space

Page 16: Large-scale analysis of bibliometric networks

Clustering can be seen as mapping in a restricted space

Page 17: Large-scale analysis of bibliometric networks

Unified approach to mapping and clustering

Minimize

$$Q(x_1, \ldots, x_n) = \sum_{i<j} \frac{2m}{k_i k_j} A_{ij} d_{ij}^2 - \sum_{i<j} d_{ij}$$

where
$n$: number of nodes in the network
$m$: total weight of all edges in the network
$A_{ij}$: weight of edge between nodes $i$ and $j$
$k_i$: total weight of all edges of node $i$

Mapping
$x_i$: vector denoting the location of node $i$ in a $p$-dimensional space

$$d_{ij} = \lVert x_i - x_j \rVert = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$$

Clustering
$x_i$: integer denoting the community to which node $i$ belongs
$\gamma$: resolution parameter

$$d_{ij} = \begin{cases} 0 & \text{if } x_i = x_j \\ 1/\gamma & \text{if } x_i \neq x_j \end{cases}$$
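To make the unified quality function concrete, the following minimal sketch evaluates Q = sum over pairs i<j of (2m / (k_i k_j)) A_ij d_ij^2, minus the sum over pairs i<j of d_ij, once with Euclidean distances (mapping) and once with the 0 or 1/gamma distance (clustering). The function names, the toy network of two triangles joined by one edge, and the coordinates are assumptions made for illustration; `gamma` stands for the resolution parameter.

```python
import math
from itertools import combinations

def quality(edges, x, distance):
    """Q = sum_{i<j} (2m / (k_i k_j)) A_ij d_ij^2 - sum_{i<j} d_ij.

    edges: list of (i, j, weight); each undirected edge listed once, no self-loops.
    x: dict mapping node -> location (mapping) or community label (clustering).
    """
    weight = {frozenset((i, j)): w for i, j, w in edges}
    k = {node: 0.0 for node in x}
    for i, j, w in edges:
        k[i] += w
        k[j] += w
    m = sum(w for _, _, w in edges)
    q = 0.0
    for i, j in combinations(sorted(x), 2):
        d = distance(x[i], x[j])
        a_ij = weight.get(frozenset((i, j)), 0.0)
        if a_ij:
            q += 2.0 * m / (k[i] * k[j]) * a_ij * d * d
        q -= d
    return q

def mapping_distance(xi, xj):
    """Euclidean distance between two locations."""
    return math.dist(xi, xj)

def clustering_distance(gamma):
    """Distance 0 within a community and 1/gamma between communities."""
    return lambda xi, xj: 0.0 if xi == xj else 1.0 / gamma

# Hypothetical toy network: two triangles joined by the edge (2, 3).
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 1.0)]

d_cluster = clustering_distance(gamma=1.0)
q_two = quality(edges, {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}, d_cluster)
q_one = quality(edges, {v: 0 for v in range(6)}, d_cluster)
q_single = quality(edges, {v: v for v in range(6)}, d_cluster)

# Made-up 2D coordinates for the mapping case.
positions = {0: (0.0, 0.0), 1: (0.0, 1.0), 2: (1.0, 0.5),
             3: (2.0, 0.5), 4: (3.0, 0.0), 5: (3.0, 1.0)}
q_map = quality(edges, positions, mapping_distance)
```

Because Q is minimized, the lowest of `q_two`, `q_one`, and `q_single` marks the preferred partition of this toy network; here that is the split into the two triangles.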

Page 18: Large-scale analysis of bibliometric networks

Smart local moving algorithm

[Figure: diagram of the algorithm with the labels: original network, local moving heuristic, local moving heuristic in subnetworks, reduced network; the Q values shown are 0.3791 and 0.4198]
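The local moving heuristic named on this slide can be sketched as follows: each node is repeatedly moved to the neighbouring community that most improves modularity, until no move helps. This is only the basic heuristic for standard modularity with a resolution parameter, written under my own assumptions (undirected network, no self-loops, hypothetical function names). It does not implement the subnetwork and network-reduction steps of the smart local moving algorithm, and it is not the authors' implementation.

```python
def build_adjacency(edges):
    """Turn (i, j, weight) triples into a symmetric adjacency dict."""
    adj = {}
    for i, j, w in edges:
        adj.setdefault(i, {})[j] = w
        adj.setdefault(j, {})[i] = w
    return adj

def modularity(adj, community, resolution=1.0):
    """Modularity of a partition of an undirected network without self-loops."""
    degree = {v: sum(nbrs.values()) for v, nbrs in adj.items()}
    two_m = sum(degree.values())
    # Each internal edge is counted twice because adj is symmetric.
    internal = sum(w for v, nbrs in adj.items() for u, w in nbrs.items()
                   if community[u] == community[v])
    totals = {}
    for v, c in community.items():
        totals[c] = totals.get(c, 0.0) + degree[v]
    expected = sum(t * t for t in totals.values()) / two_m
    return (internal - resolution * expected) / two_m

def local_moving(adj, resolution=1.0):
    """Move nodes between communities until no move increases modularity."""
    degree = {v: sum(nbrs.values()) for v, nbrs in adj.items()}
    two_m = sum(degree.values())
    community = {v: v for v in adj}   # start with singleton communities
    total = dict(degree)              # total degree per community
    moved = True
    while moved:
        moved = False
        for v in adj:
            old = community[v]
            total[old] -= degree[v]   # temporarily take v out of its community
            links = {}                # edge weight from v to each neighbouring community
            for u, w in adj[v].items():
                links[community[u]] = links.get(community[u], 0.0) + w

            def gain(c):
                # Modularity gain of placing v in community c, up to a constant factor.
                return links.get(c, 0.0) - resolution * degree[v] * total.get(c, 0.0) / two_m

            best, best_gain = old, gain(old)
            for c in links:
                g = gain(c)
                if g > best_gain:     # strict improvement only
                    best, best_gain = c, g
            community[v] = best
            total[best] = total.get(best, 0.0) + degree[v]
            if best != old:
                moved = True
    return community

# Hypothetical toy network: two triangles joined by the edge (2, 3).
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 1.0)]
adj = build_adjacency(edges)
labels = local_moving(adj)
q_found = modularity(adj, labels)
q_singletons = modularity(adj, {v: v for v in adj})
```

On this small example the heuristic groups each triangle into its own community. On larger networks a single pass of local moving can get stuck in poorer solutions, which is the motivation for the additional steps shown in the slide's diagram.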

Page 19: Large-scale analysis of bibliometric networks

Algorithmically constructed classification system of science

• 17.8 million publications from the period 2000–2015 indexed in Web of Science
• 282.4 million citation relations
• Classification system of 3 hierarchical levels:
– 27 broad disciplines
– 817 fields
– 4,113 subfields

Page 20: Large-scale analysis of bibliometric networks

Breakdown of scientific literature into 817 fields

[Figure; area labels: Social sciences and humanities, Biomedical and health sciences, Life and earth sciences, Mathematics and computer science, Physical sciences and engineering]

Page 21: Large-scale analysis of bibliometric networks

Publications in scientometrics subfield

Page 22: Large-scale analysis of bibliometric networks

Time-line map of highly cited scientometrics publications

Page 23: Large-scale analysis of bibliometric networks

Analysis of data science

Page 24: Large-scale analysis of bibliometric networks

What is data science?

• Empirical operationalization of data science based on publications with ‘data’ in title or abstract

Wikipedia: “Data Science is an interdisciplinary field about processes and systems to extract knowledge or insights from data … which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics”

LCDS: “Data Science … deals with finding, analyzing and validating complex patterns in data. Data Science methods are indispensable for maintaining a competitive edge in all disciplines in science”
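The empirical operationalization on this slide (publications with ‘data’ in title or abstract) could be approximated as in the sketch below, which also computes the share of such publications per year. The record layout (`year`, `title`, `abstract` keys), the sample records, and the choice to match ‘data’ only as a whole word are assumptions of this example; the slides do not specify the exact matching rule.

```python
import re
from collections import defaultdict

# Assumption: 'data' is matched as a whole word, case-insensitively,
# so that e.g. 'database' does not count.
DATA_WORD = re.compile(r"\bdata\b", re.IGNORECASE)

def is_data_publication(pub):
    """Return True if 'data' occurs in the title or the abstract."""
    text = f"{pub.get('title', '')} {pub.get('abstract', '')}"
    return DATA_WORD.search(text) is not None

def data_share_by_year(pubs):
    """Fraction of publications per year that mention 'data'."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for pub in pubs:
        totals[pub["year"]] += 1
        if is_data_publication(pub):
            hits[pub["year"]] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}

# Hypothetical records, for illustration only.
pubs = [
    {"year": 2014, "title": "Mining big data", "abstract": ""},
    {"year": 2014, "title": "A theory of networks", "abstract": "We prove a theorem."},
    {"year": 2015, "title": "Citation analysis", "abstract": "We use Data from Scopus."},
    {"year": 2015, "title": "Database design", "abstract": ""},
]
shares = data_share_by_year(pubs)
```

The same per-year shares, computed for ‘data’ and for ‘theory’, would give the kind of trend lines plotted in a growth chart.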

Page 25: Large-scale analysis of bibliometric networks

Growth of data-driven research

[Figure: chart for 1990–2015; y-axis: percentage of publications (0% to 20%); series: % 'data' publications, % 'theory' publications]

Page 26: Large-scale analysis of bibliometric networks

Breakdown of scientific literature into 817 fields

[Figure; area labels: Social sciences and humanities, Biomedical and health sciences, Life and earth sciences, Mathematics and computer science, Physical sciences and engineering]

Page 27: Large-scale analysis of bibliometric networks

Data-driven nature of different scientific fields

[Figure; area labels: Social sciences and humanities, Biomedical and health sciences, Life and earth sciences, Mathematics and computer science, Physical sciences and engineering; legend: % pub. with ‘data’ in title or abstract]

Page 28: Large-scale analysis of bibliometric networks

Data-driven nature of different scientific fields

[Figure; field labels: artificial intelligence, statistics, bioinformatics, neuroimaging, pattern recognition, astronomy, earth, water, climate, remote sensing, nutrition, obesity, addiction, accident analysis; legend: % pub. with ‘data’ in title or abstract]

Page 29: Large-scale analysis of bibliometric networks

Data science fields (at least 25% ‘data’ publications)

[Figure; area labels: Social sciences and humanities, Biomedical and health sciences, Life and earth sciences, Mathematics and computer science, Physical sciences and engineering]
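The selection rule in this slide's title (a field counts as a data science field when at least 25% of its publications are ‘data’ publications) amounts to a simple threshold on per-field shares. A minimal sketch follows, in which the field names and counts are invented placeholders rather than results from the analysis.

```python
def data_science_fields(field_counts, threshold=0.25):
    """Select fields whose share of 'data' publications reaches the threshold.

    field_counts maps field -> (number of 'data' publications, total publications).
    """
    selected = {}
    for field, (n_data, n_total) in field_counts.items():
        if n_total == 0:
            continue  # skip empty fields to avoid division by zero
        share = n_data / n_total
        if share >= threshold:
            selected[field] = share
    return selected

# Hypothetical counts, for illustration only.
field_counts = {
    "field_a": (30, 100),
    "field_b": (25, 100),
    "field_c": (10, 100),
    "field_d": (0, 0),
}
selected = data_science_fields(field_counts)
```

The comparison uses `>=` so that a field sitting exactly at 25% is included, matching the "at least" wording.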

Page 30: Large-scale analysis of bibliometric networks

Term map of data science fields


Page 31: Large-scale analysis of bibliometric networks

China’s publication output in data science fields

[Figure; area labels: Social sciences and humanities, Biomedical and health sciences, Life and earth sciences, Mathematics and computer science, Physical sciences and engineering]

Page 32: Large-scale analysis of bibliometric networks

China’s publication output in data science fields

[Figure; field labels: artificial intelligence, pattern recognition, high energy, earth, atmospheres, weather, remote sensing]

Page 33: Large-scale analysis of bibliometric networks

Chinese institutes with most publications in data science fields (2011–2015)

• Chinese Academy of Sciences

• Peking University

• Tsinghua University

• China University of Geosciences

• Zhejiang University

• Nanjing University

• Shanghai Jiao Tong University

• University of Science and Technology of China

• Beijing Normal University

• University of Hong Kong


Page 34: Large-scale analysis of bibliometric networks

CAS publication output in data science fields

[Figure; field labels: earth, atmospheres, weather, remote sensing, vegetation, astronomy, high energy]

Page 35: Large-scale analysis of bibliometric networks

Term map based on CAS publications in data science fields

Page 36: Large-scale analysis of bibliometric networks

CAS (Beijing Branch) publication output in data science fields

[Figure; field labels: astronomy, earth, atmospheres, weather, remote sensing, vegetation, high energy]

Page 37: Large-scale analysis of bibliometric networks

CAS (Shanghai Branch) publication output in data science fields

[Figure; field labels: bioinformatics, genetics, astronomy, nuclear]

Page 38: Large-scale analysis of bibliometric networks

Do it yourself!


www.vosviewer.com www.citnetexplorer.nl

Page 39: Large-scale analysis of bibliometric networks

Thank you for your attention!
