Transcript

Big Data for the Social Sciences: The Landscape for Web Observatories

David De Roure, Strategic Adviser for Data Resources @dder

Overview

1. Big Data for research (UK perspective)

2. Social Media Data is distinctive

3. A series of shifts in how scholarship is conducted

4. And hence the context for Web Observatories

Big Data doesn’t respect disciplinary boundaries

Digital Social Research

Edwards, P. N., et al. (2013) Knowledge Infrastructures: Intellectual Frameworks and Research Challenges. Ann Arbor: Deep Blue. http://hdl.handle.net/2027.42/97552

theODI.org

Mandy Chessell

The Big Picture

[Diagram labels: More people; More machines; Big Data, Big Compute; Conventional Computation; “Big Social”, Social Networks; e-infrastructure; online R&D; Big Data Production & Analytics; deeply about society; The future]

RCUK and Big Data

▶ ‘Big data is a term for a collection of datasets so large and complex that it is beyond the ability of typical database software tools to capture, store, manage, and analyse them. ‘Big’ is not defined as being larger than a certain number of ‘bytes’ because as technology advances over time, the size of datasets that qualify as big data will also increase’ (RCUK)

Research benefits of new data

▶ Undertaking research on pressing policy-related issues without the need for new data collection

• Food consumption, social background and obesity

• Energy consumption, housing type and climatic conditions

• Rural location, private/public transport alternatives and incomes

• School attainment, higher education participation, subject choices, student debt and later incomes

▶ New data such as social media enable us to ask big questions, about big populations, and in real time – this is transformative

Big Data Network

Phase 1 and 2

Research questions

– Social and political movements

– Political participation and trust

– Individual, group/community and national identities

– Personal, local, national and global security (including crime, law enforcement and defence)

– Rural development and ‘Urban Transformations’

– Crisis prevention, preparedness, response, management and recovery

– Education

– Health and wellbeing (including ageing)

– Environment and sustainability

– Economic growth and financial markets (including employment and the labour market)

E-infrastructure Leadership Council

Mandy Chessell

First

Interdisciplinary and “in the wild” *

* “in it” versus “on it”

Nigel Shadbolt et al

Real life is and must be full of all kinds of social constraint – the very processes from which society arises. Computers can help if we use them to create abstract social machines on the Web: processes in which the people do the creative work and the machine does the administration... The stage is set for an evolutionary growth of new social engines. The ability to create new forms of social process would be given to the world at large, and development would be rapid.

Berners-Lee, Weaving the Web, 1999 (pp. 172–175)

The Order of Social Machines

SOCIAM: The Theory and Practice of Social Machines is funded by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant number EP/J017728/1 and comprises the Universities of Southampton, Oxford and Edinburgh. See sociam.org

A revolutionary idea… Open Science!

Join the W3C Community Group www.w3.org/community/rosc

Jun Zhao

www.researchobject.org

Web as lens

Web as artefact

Web as infrastructure

Web Observatories

http://www.w3.org/community/webobservatory/

Big data elephant versus sense-making network?

The challenge is to foster the co-constituted socio-technical system on the right i.e. a computationally-enabled sense-making network of expertise, data, models, visualisations and narratives

Iain Buchan

Pip Willcox

The Observatory Context

▶ New forms of data enable us to answer old questions in new ways and to address entirely new questions

– Especially about (new) social processes

▶ There are multiple shifts occurring:

– Academia and business

– Volumes and velocity of data

– Realtime analytics

– Computational infrastructure

– Dataflows vs datasets (and curation infrastructure)

– Correlation vs causation

– Increasing automation and ethical implications

– Machine-to-Machine in Internet of Things

david.deroure@oerc.ox.ac.uk

www.oerc.ox.ac.uk/people/dder

@dder

Slide and image credits: Fiona Armstrong, Christine Borgman, Iain Buchan, Mandy Chessell, Neil Chue Hong, Cat De Roure, Kevin Page, Nigel Shadbolt, Pip Willcox, Jun Zhao, Guardian newspaper

www.oerc.ox.ac.uk
