Large-scale analysis of bibliometric data sources. Nees Jan van Eck, Centre for Science and Technology Studies (CWTS), Leiden University. 8th LCDS Meeting: Statistics & Data Science, Leiden, November 13, 2015.


Apr 16, 2017

Page 1: Large-scale analysis of bibliometric data sources

Large-scale analysis of bibliometric data sources

Nees Jan van Eck

Centre for Science and Technology Studies (CWTS), Leiden University

8th LCDS Meeting: Statistics & Data Science

Leiden, November 13, 2015

Page 2: Large-scale analysis of bibliometric data sources

About myself

• Master in computer science

• PhD thesis on bibliometric mapping of science

• Researcher at CWTS since 2009

• Research focus on analysis and visualization of bibliometric networks

Page 3: Large-scale analysis of bibliometric data sources

Centre for Science and Technology Studies (CWTS)

• Research center at Leiden University focusing on science and technology studies

• About 30 staff members

• History of more than 25 years in bibliometric and scientometric research

• Contract research

• Full access to large bibliographic databases (Web of Science and Scopus)

Page 4: Large-scale analysis of bibliometric data sources

Bibliographic databases: ‘Big data’

              Web of Science   Scopus
Journals      12,000           20,000
Publications  45 million       35 million
Citations     1 billion        0.9 billion

Page 5: Large-scale analysis of bibliometric data sources

Bibliometric networks

[Diagram: bibliographic database (Web of Science, Scopus) and the bibliometric networks derived from it]

• Citation network of publications

• Co-authorship network of authors / organizations

• Co-citation network of pubs / authors / journals

• Co-occurrence network of terms

• Bibliographic coupling network of pubs / authors / journals
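As a rough illustration of how these network types can be derived from publication records, here is a minimal Python sketch. The record layout (`id`, `authors`, `references`) and the toy data are assumptions made for this example; they are not the Web of Science or Scopus schema, and this is not how VOSviewer or CitNetExplorer build their networks internally.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records; the field names are assumptions for illustration only.
publications = [
    {"id": "p1", "authors": ["A", "B"], "references": ["r1", "r2"]},
    {"id": "p2", "authors": ["B", "C"], "references": ["r2", "r3"]},
    {"id": "p3", "authors": ["A", "B", "C"], "references": ["r1", "r2", "p1"]},
]

def citation_edges(pubs):
    """Directed (citing, cited) pairs among the publications in the set."""
    ids = {pub["id"] for pub in pubs}
    return [(pub["id"], ref) for pub in pubs for ref in pub["references"] if ref in ids]

def co_authorship(pubs):
    """Edge weight = number of publications two authors wrote together."""
    edges = Counter()
    for pub in pubs:
        for a, b in combinations(sorted(set(pub["authors"])), 2):
            edges[(a, b)] += 1
    return edges

def co_citation(pubs):
    """Edge weight = number of publications that cite both items."""
    edges = Counter()
    for pub in pubs:
        for a, b in combinations(sorted(set(pub["references"])), 2):
            edges[(a, b)] += 1
    return edges

def bibliographic_coupling(pubs):
    """Edge weight = number of references two publications have in common."""
    edges = Counter()
    for p, q in combinations(pubs, 2):
        shared = set(p["references"]) & set(q["references"])
        if shared:
            edges[tuple(sorted((p["id"], q["id"])))] = len(shared)
    return edges
```

A co-occurrence network of terms can be built the same way as `co_authorship`, with the list of terms extracted from each title or abstract taking the place of the author list.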

Page 6: Large-scale analysis of bibliometric data sources

Outline

• Software tools

• Network analysis techniques

• Analysis of data science


Page 7: Large-scale analysis of bibliometric data sources

Software tools


Page 8: Large-scale analysis of bibliometric data sources

Software tools

• VOSviewer (www.vosviewer.com)

– Tool for constructing and visualizing bibliometric networks

• CitNetExplorer (www.citnetexplorer.nl)

– Tool for visualizing and analyzing citation networks of publications

• Both tools have been developed together with my colleague Ludo Waltman

Page 9: Large-scale analysis of bibliometric data sources

VOSviewer


Page 10: Large-scale analysis of bibliometric data sources

Map of university co-authorship network

Page 11: Large-scale analysis of bibliometric data sources

Map of journal citation network


Page 12: Large-scale analysis of bibliometric data sources

CitNetExplorer


Page 13: Large-scale analysis of bibliometric data sources

Network analysis techniques

Page 14: Large-scale analysis of bibliometric data sources

Network analysis techniques

Layout:

• Visualization of similarities (VOS)

Community detection:

• Weighted modularity

• Smart local moving algorithm
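For readers unfamiliar with modularity, the sketch below computes the standard weighted modularity of a given partition, Q = Σ_c [ w_c / m − γ (K_c / 2m)² ], where w_c is the total edge weight inside community c, K_c is the summed weighted degree of its nodes, and m is the total edge weight. The slides do not spell out which variant the talk uses, so this textbook form, the function name, and the `resolution` parameter (γ) are assumptions.

```python
from collections import defaultdict

def weighted_modularity(edges, community, resolution=1.0):
    """Weighted modularity of a partition.

    edges: dict {(u, v): weight}, undirected, no self-loops, each edge listed once.
    community: dict {node: community label}.
    """
    total = sum(edges.values())          # m: total edge weight
    strength = defaultdict(float)        # weighted degree of each node
    internal = defaultdict(float)        # edge weight inside each community
    for (u, v), w in edges.items():
        strength[u] += w
        strength[v] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    comm_strength = defaultdict(float)   # summed weighted degree per community
    for node, k in strength.items():
        comm_strength[community[node]] += k
    return sum(
        internal[c] / total - resolution * (comm_strength[c] / (2 * total)) ** 2
        for c in comm_strength
    )

# Toy network: two triangles joined by a single edge.
toy_edges = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): 1, (3, 4): 1, (3, 5): 1, (4, 5): 1}
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "a" for n in range(6)}
q_two = weighted_modularity(toy_edges, two_groups)
q_one = weighted_modularity(toy_edges, one_group)
```

On this toy network, splitting into the two triangles scores higher than putting all nodes into one community, which always scores zero at resolution 1.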

Page 15: Large-scale analysis of bibliometric data sources

Smart local moving algorithm

[Diagram with the labels: original network; local moving heuristic; local moving heuristic in subnetworks; reduced network. Modularity values shown: Q = 0.3791 and Q = 0.4198]
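The local moving heuristic named on this slide can be sketched as follows: starting from singleton communities, each node is moved to the neighboring community that gives the largest modularity gain, and this is repeated until no move improves modularity. This is only the basic building block, written under assumptions of my own (fixed node order, standard weighted modularity, a hypothetical function name). It does not include the steps in subnetworks or the network reduction that the smart local moving algorithm adds.

```python
from collections import defaultdict

def local_moving(edges):
    """Basic local moving heuristic for modularity (illustrative sketch).

    edges: dict {(u, v): weight}, undirected, no self-loops, each edge listed once.
    Returns dict {node: community label}.
    """
    neighbors = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbors[u][v] = w
        neighbors[v][u] = w
    two_m = 2.0 * sum(edges.values())
    strength = {n: sum(nb.values()) for n, nb in neighbors.items()}
    community = {n: n for n in neighbors}   # start from singleton communities
    comm_strength = dict(strength)          # summed weighted degree per community
    moved = True
    while moved:
        moved = False
        # Fixed visiting order for reproducibility; real implementations often randomize it.
        for n in neighbors:
            old = community[n]
            comm_strength[old] -= strength[n]        # temporarily take n out
            links = defaultdict(float)               # weight from n to each adjacent community
            for nb, w in neighbors[n].items():
                links[community[nb]] += w
            # Gain of placing n in community c, up to a constant factor:
            # links[c] - strength[n] * comm_strength[c] / (2m)
            best = old
            best_gain = links[old] - strength[n] * comm_strength[old] / two_m
            for c, w in links.items():
                gain = w - strength[n] * comm_strength[c] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            community[n] = best
            comm_strength[best] += strength[n]
            if best != old:
                moved = True
    return community

# Toy network: two triangles joined by a single edge.
toy_edges = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): 1, (3, 4): 1, (3, 5): 1, (4, 5): 1}
result = local_moving(toy_edges)
```

Because a node only moves when the gain strictly increases, modularity rises with every move and the loop terminates. On the toy network the heuristic recovers the two triangles as communities.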

Page 16: Large-scale analysis of bibliometric data sources

Algorithmically constructed classification system of science

• 16.2 million publications from the period 2000–2014 indexed in Web of Science

• 241.7 million citation relations

• Classification system of 3 hierarchical levels:

– 28 broad disciplines

– 813 fields

– 3,822 subfields

Page 17: Large-scale analysis of bibliometric data sources

Breakdown of scientific literature into 813 fields

[Figure; area labels: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering]

Page 18: Large-scale analysis of bibliometric data sources

Publications in scientometrics subfield

Page 19: Large-scale analysis of bibliometric data sources

Time-line map of highly cited scientometrics publications

Page 20: Large-scale analysis of bibliometric data sources

Analysis of data science

Page 21: Large-scale analysis of bibliometric data sources

What is data science?

• Empirical operationalization of data science based on publications with ‘data’ in title or abstract

Wikipedia: “Data Science is an interdisciplinary field about processes and systems to extract knowledge or insights from data … which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics”

LCDS: “Data Science … deals with finding, analyzing and validating complex patterns in data. Data Science methods are indispensable for maintaining a competitive edge in all disciplines in science”

Page 22: Large-scale analysis of bibliometric data sources

Growth of data-driven research

[Line chart. Y-axis: percentage of publications (0% to 20%). X-axis: year (1990 to 2015). Series: % 'data' publications; % 'theory' publications]
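The operationalization "publications with 'data' in title or abstract" and the yearly percentages in this chart could be computed along the following lines. This is a minimal sketch under stated assumptions: the record fields (`year`, `title`, `abstract`), the sample records, and the choice of case-insensitive whole-word matching are all mine. The slides do not say whether words such as "database" were counted.

```python
import re
from collections import defaultdict

# Assumption: match 'data' as a whole word, case-insensitively ("database" does not count).
DATA_PATTERN = re.compile(r"\bdata\b", re.IGNORECASE)

def mentions_data(pub):
    """True if 'data' occurs in the title or abstract of a publication record."""
    text = f"{pub.get('title', '')} {pub.get('abstract', '')}"
    return DATA_PATTERN.search(text) is not None

def data_share_by_year(pubs):
    """Fraction of publications per year that mention 'data'."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for pub in pubs:
        totals[pub["year"]] += 1
        if mentions_data(pub):
            hits[pub["year"]] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}

# Hypothetical sample records, for illustration only.
sample = [
    {"year": 1990, "title": "A theory of citation", "abstract": "We propose a model."},
    {"year": 1990, "title": "Survey data on reading habits", "abstract": ""},
    {"year": 2015, "title": "Mapping science", "abstract": "We analyze Scopus data."},
    {"year": 2015, "title": "Database design notes", "abstract": "Schema choices."},
    {"year": 2015, "title": "Big Data methods", "abstract": ""},
]
shares = data_share_by_year(sample)
```

The 'theory' series could be produced the same way by swapping the pattern.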

Page 23: Large-scale analysis of bibliometric data sources

Breakdown of scientific literature into 813 fields

[Figure; area labels: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering]

Page 24: Large-scale analysis of bibliometric data sources

Data-driven nature of different scientific fields

[Figure; area labels: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering. Legend: % pub. with ‘data’ in title or abstract]

Page 25: Large-scale analysis of bibliometric data sources

Data-driven nature of different scientific fields

[Figure; field labels: artificial intelligence; statistics; bioinformatics; neuroimaging; pattern recognition; astronomy; earth; water; weather; climate; remote sensing; nutrition; obesity; addiction. Legend: % pub. with ‘data’ in title or abstract]

Page 26: Large-scale analysis of bibliometric data sources

Data science fields (at least 20% ‘data’ publications)

[Figure; area labels: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering]
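The 20% criterion in the slide title amounts to a per-field share followed by a threshold. A minimal sketch, assuming each publication has already been assigned to a field and flagged as a 'data' publication; the field names and counts below are invented for illustration and are not results from the talk.

```python
from collections import defaultdict

def data_science_fields(pubs, threshold=0.20):
    """Return the fields whose share of 'data' publications is at least `threshold`.

    pubs: iterable of (field, is_data_publication) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for field, is_data in pubs:
        totals[field] += 1
        if is_data:
            hits[field] += 1
    return sorted(f for f in totals if hits[f] / totals[f] >= threshold)

# Hypothetical fields and counts.
sample = (
    [("field_a", True)] * 1 + [("field_a", False)] * 3    # 25% 'data' publications
    + [("field_b", True)] * 1 + [("field_b", False)] * 9  # 10%
    + [("field_c", True)] * 1 + [("field_c", False)] * 4  # exactly 20%
)
selected = data_science_fields(sample)
```

The comparison uses `>=` so that a field at exactly 20% qualifies, matching the "at least" wording.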

Page 27: Large-scale analysis of bibliometric data sources

Term map of data science fields


Page 28: Large-scale analysis of bibliometric data sources

Leiden University’s publication output in data science fields

[Figure; area labels: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering]

Page 29: Large-scale analysis of bibliometric data sources

Leiden University’s institutes with most publications in data science fields

• Leiden Observatory

• LUMC

• Faculty of Archaeology

• Institute of Psychology (FSW)

• Centre for Science and Technology Studies (FSW)

• Mathematical Institute (Science)

• Institute of Biology Leiden (Science)

• Leiden Institute of Advanced Computer Science (Science)

Page 30: Large-scale analysis of bibliometric data sources

LUMC departments with most publications in data science fields

• Medical Statistics and Bioinformatics

• Rheumatology

• Psychiatry

• Radiology

• Clinical Epidemiology

• Human Genetics

• Neurosurgery

• Cardiology

• Clinical Oncology

• Endocrinology

Page 31: Large-scale analysis of bibliometric data sources

Term map based on Leiden University’s publications in data science fields

Page 32: Large-scale analysis of bibliometric data sources

Do it yourself!


www.vosviewer.com www.citnetexplorer.nl

Page 33: Large-scale analysis of bibliometric data sources

Thank you for your attention!
