The promises of web scraping: Mining the web for relational data about artists. ResTo study day (Journée d’étude ResTo), 14 April 2016. Guillaume Cabanac (@gcabanac)

Apr 15, 2017

Page 1

The promises of web scraping: Mining the web for relational data about artists

ResTo study day (Journée d’étude ResTo), 14 April 2016

Guillaume Cabanac

@gcabanac

Page 2

The promises of web scraping

1. Why?

2. How?

3. Case studies

4. Why not?

Page 4

Web scraping: Why?

Purpose: fetch the Impact Factor of indexed journals in Computer Science

Page 6

Web scraping: How?

HTML page structure: the Document Object Model (DOM)

Source: http://www.dartlang.org/docs/tutorials/connect-dart-html/
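A scraper relies on the page structure described above: it walks the nested HTML elements and keeps the text of the ones it needs. The sketch below is only an illustration, using Python's standard `html.parser` module. The HTML fragment, the journal names, the values, and the helper names `TableScraper` and `scrape_impact_factors` are all hypothetical and do not come from the talk.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; journal names and values are made up.
SAMPLE_HTML = """
<html><body>
  <table id="journals">
    <tr><th>Journal</th><th>Impact Factor</th></tr>
    <tr><td>Journal A</td><td>1.5</td></tr>
    <tr><td>Journal B</td><td>0.75</td></tr>
  </table>
</body></html>
"""


class TableScraper(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell texts
        self._row = None        # cells of the row being read, None outside a row
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:       # header rows hold only <th> cells, so they stay empty
                self.rows.append(self._row)
            self._row = None


def scrape_impact_factors(html):
    """Return a {journal name: impact factor} mapping from a two-column table."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return {name.strip(): float(value) for name, value in parser.rows}


print(scrape_impact_factors(SAMPLE_HTML))
```

On a real site the table would first have to be downloaded, and the tag names or attributes to look for depend on that site's markup.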

Page 8

Study 1: scientists and workaholism

Page 9

Study 1: scientists and workaholism

Sunday! Even on bank holidays in many countries!
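The transcript does not say which timestamps were scraped for this study. Assuming the scraper yields a list of dates, a tally by day of the week could be sketched as follows. The dates and the helper name `weekday_counts` are hypothetical and are not data from the study.

```python
from collections import Counter
from datetime import date

# Hypothetical scraped dates in ISO format; not data from the study.
SCRAPED_DATES = ["2016-04-10", "2016-04-11", "2016-04-14", "2016-04-17"]

# date.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_counts(iso_dates):
    """Count how many dates fall on each day of the week."""
    return Counter(DAY_NAMES[date.fromisoformat(d).weekday()] for d in iso_dates)


counts = weekday_counts(SCRAPED_DATES)
print(counts["Sunday"])
```

Checking bank holidays would additionally need a per-country holiday calendar, which this sketch leaves out.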

Page 10

Study 1: scientists and workaholism

SCRAP

Page 11

Study 2: networks of references via Google Scholar

Page 12

Study 2: networks of references via Google Scholar

Page 13

Study 2: networks of references via Google Scholar

Page 14

Study 2: networks of references via Google Scholar

SCRAP

Page 15

Study 2: networks of references via Google Scholar

Page 16

Study 2: networks of references via Google Scholar
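The figures for this study are not part of the transcript. As a rough illustration of what a network of references can look like once scraped, the sketch below turns (citing, cited) pairs into a directed graph and counts incoming references. The pairs, the identifiers, and the helper name `build_network` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (citing, cited) pairs, as they might come out of scraped reference lists.
EDGES = [
    ("paper_a", "paper_b"),
    ("paper_a", "paper_c"),
    ("paper_b", "paper_c"),
    ("paper_a", "paper_b"),  # duplicate pair, e.g. the same reference scraped twice
]


def build_network(edges):
    """Return (references, cited_by_count) for a list of (citing, cited) pairs.

    references maps each citing item to the set of items it cites;
    cited_by_count maps each cited item to its number of distinct citers.
    """
    references = defaultdict(set)
    cited_by_count = defaultdict(int)
    for citing, cited in edges:
        if cited not in references[citing]:  # ignore duplicate pairs
            references[citing].add(cited)
            cited_by_count[cited] += 1
    return dict(references), dict(cited_by_count)


references, cited_by_count = build_network(EDGES)
print(cited_by_count)
```

Sets are used for the outgoing references so that a reference scraped twice is only counted once.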

Page 17

Study 3: The world of arts – work in progress …

Page 18

Study 3: The world of arts – work in progress …

Page 19

Study 3: The world of arts – work in progress …

Page 20

Study 3: The world of arts – work in progress …

Result

Page 22

Web scraping: Why not?

Scraping is usually forbidden
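One machine-readable signal of what a site permits is its `robots.txt` file. The sketch below checks a URL against such a file with Python's standard `urllib.robotparser` module. The file content, the bot name, the URLs, and the helper name `is_allowed` are hypothetical. A site's terms of use may forbid scraping even where `robots.txt` allows a path, so this check is at most a first filter.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real file is served at the root of the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical bot name and URLs.
print(is_allowed(ROBOTS_TXT, "my-research-bot", "https://example.org/search/artists"))
print(is_allowed(ROBOTS_TXT, "my-research-bot", "https://example.org/about"))
```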

Page 23

Web scraping: Why not?

… but things are changing, at least in the UK

http://www.slideshare.net/petermurrayrust/content-mining-at-wellcome-trust

Page 24

Web scraping: Why not?

Data quality issues, especially on Google Scholar?

Page 25

Web scraping: Why not?

Data quality issues, especially on Google Scholar?