
In praise of inconsistency: the long tail of small data

Alan Dix
Talis and Lancaster University

www.hcibook.com/alan/
alandix.com/blog

YORK OCT 2011 (slides dated 10/28/11)

Lancaster University

Tiree

Talis

Tiree Tech Wave 3-7 Nov


today I am not talking about …

• intelligent internet interfaces
• visualisation and sampling
• situated displays, eCampus, small device – large display interactions
• fun and games, virtual crackers, artistic performance, slow time
• physicality and product design
• creativity and Bad Ideas
• modelling dreams and regret

… or even lots of lights

http://www.hcibook.com/alan/projects/firefly/


back in the 1980s ... Codd and all that

• in theory:
  – normalisation, atomicity
  – illusion of single use & strong internal consistency
• in practice:
  – de-normalise for efficiency
  – maintain consistency through controlled transactions
  – business logic, APIs

the IS ideal

single central repository
transactional update
multiple views


the more things change ...

the cloud
API update
web-based views

... the more they stay the same

single central repository
transactional update
multiple views


limits of consistency

consistency not always possible

• distribution and caching
• multi-user update (Alison and Brian)
• view-based updates

http://www.perryslingsbysystems.com/trenchers.html
http://www.vfridge.com/

ordering problems (race conditions)

Alison and Brian both press send at about the same time:

Alison: It's a beautiful day. Let's go out after work.
Brian: I agree totally
Alison: perhaps not, I look awful after the late party

one screen shows:
  Alison: It's a beautiful day. Let's go out after work.
  Alison: perhaps not, I look awful after the late party
  Brian: I agree totally

the other shows:
  Alison: It's a beautiful day. Let's go out after work.
  Brian: I agree totally
  Alison: perhaps not, I look awful after the late party
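The crossing messages can be reproduced in a few lines. This is only a sketch under simple assumptions: each client shows its own message as soon as it is sent and appends remote messages in arrival order. The `Client` class and its methods are hypothetical names for illustration, not part of any real chat system.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """Hypothetical chat client holding its own local transcript."""
    name: str
    transcript: list = field(default_factory=list)

    def send(self, text):
        # Show our own message immediately and return it for delivery.
        message = (self.name, text)
        self.transcript.append(message)
        return message

    def receive(self, message):
        # Append a remote message whenever it happens to arrive.
        self.transcript.append(message)


alison, brian = Client("Alison"), Client("Brian")

first = alison.send("It's a beautiful day. Let's go out after work.")
brian.receive(first)

# Both now type at the same time; neither has seen the other's message.
second = alison.send("perhaps not, I look awful after the late party")
reply = brian.send("I agree totally")

# The two messages cross in transit.
alison.receive(reply)
brian.receive(second)

same_order = alison.transcript == brian.transcript
print(same_order)
```

Both transcripts hold the same three messages, but on Alison's screen Brian appears to agree that she looks awful. Neither copy is wrong; they are simply ordered differently.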


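view based update: complementary functions

[diagram: the view function v maps the central state / database S to the view / display D; an update f takes S to S′, and the display changes by v(f), taking D to D′]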


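view based update: complementary functions

[diagram: the view function v maps the central state / database S to the view / display D; an operation op on the display takes D to D′, and is translated back into the update v⁻¹(op), taking S to S′]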
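One way to read the diagram: an operation on the display is translated into a state update by holding constant the part of the state that the view hides (its complement). The sketch below assumes a toy state mapping names to departments and a view that shows only the 'technical' department. The names `view`, `complement`, `translate` and the sample data are hypothetical.

```python
def view(state):
    """v: S -> D. Show only the names in the 'technical' department."""
    return {name for name, dept in state.items() if dept == "technical"}


def complement(state):
    """Everything the view hides: the rows for all other departments."""
    return {name: dept for name, dept in state.items() if dept != "technical"}


def translate(op):
    """Turn a display update op: D -> D into a state update S -> S.

    The hidden complement is held constant and the visible part is
    rebuilt from the updated display.
    """
    def state_update(state):
        new_display = op(view(state))
        new_state = dict(complement(state))
        new_state.update({name: "technical" for name in new_display})
        return new_state
    return state_update


def add_cat(display):
    # An edit made on the display: add one more name.
    return display | {"cat"}


state = {"ann": "technical", "bob": "accounts"}
new_state = translate(add_cat)(state)
print(sorted(view(new_state)))
```

The translated update commutes with the view, and the hidden rows are untouched. The sketch does not handle the awkward cases, for example adding a name that already exists in a hidden department, which is exactly where view updates become ambiguous.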

... always

goal is eventual consistency


sometimes not possible

distributed garbage collection
  – various algorithms ... aim to make sure referenced items not lost
  – but a storage node can always die

• options:
  – prevent loss of referenced item
  – accept loss of referenced item
    • leases or “ref not found” exceptions
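The "accept loss" option can be sketched as a store in which every item is held on a lease and a dangling reference raises an exception instead of being prevented. This is a minimal illustration, not a real distributed garbage collector; `LeasedStore`, `RefNotFound` and the injectable `clock` are hypothetical names.

```python
import time


class RefNotFound(Exception):
    """Raised when a reference points at an item that no longer exists."""


class LeasedStore:
    """Hypothetical store that keeps each item only while its lease lasts."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self._items = {}  # key -> (value, expiry time)

    def put(self, key, value):
        self._items[key] = (value, self.clock() + self.lease_seconds)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or entry[1] <= self.clock():
            # The item was never stored or its lease ran out: drop it.
            self._items.pop(key, None)
            raise RefNotFound(key)
        return entry[0]

    def renew(self, key):
        # Re-storing the current value restarts the lease.
        self.put(key, self.get(key))


now = [0.0]  # fake clock so the example is deterministic
store = LeasedStore(lease_seconds=10, clock=lambda: now[0])
store.put("report", "draft text")

now[0] = 5.0
store.renew("report")            # lease now runs until t = 15
now[0] = 12.0
still_there = store.get("report")

now[0] = 30.0                    # nobody renewed in time
try:
    store.get("report")
    lost = False
except RefNotFound:
    lost = True
```

The holder of a reference has to be written to cope with `RefNotFound`, which is the point of the slide: the inconsistency is surfaced to the caller, not hidden.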

what is consistent?

• conflicting updates
• long-term transactions
• synchronisation ... and Apple still can’t get it right!!

http://www.vfridge.com/


internal and external consistency

• the exam board ....

is the world consistent anyway?

• departmental lists


a different approach

do not enforce consistency

but highlight inconsistency

• instead of views of central data, related yet different sources
• specify connections and automatically check: inform of updates, highlight discrepancies, but allow divergence
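A rough sketch of "highlight, don't enforce": compare related sources key by key and report where they disagree, without changing any of them. The source names and staff rows are made up for illustration, and `discrepancies` is a hypothetical helper.

```python
def discrepancies(sources):
    """Return {key: {source_name: value}} for keys whose values differ.

    A key missing from a source is reported as None.
    No source is modified: divergence is allowed, only made visible.
    """
    keys = set()
    for rows in sources.values():
        keys.update(rows)

    report = {}
    for key in sorted(keys):
        seen = {name: rows.get(key) for name, rows in sources.items()}
        if len(set(seen.values())) > 1:
            report[key] = seen
    return report


# Hypothetical data: name -> department, held in three places.
sources = {
    "central database": {"ann": "technical", "bob": "accounts"},
    "colleague's spreadsheet": {"ann": "technical", "bob": "technical"},
    "own word table": {"ann": "technical"},
}

report = discrepancies(sources)
for key, seen in report.items():
    print(key, seen)
```

Here only `bob` is reported: the central database and the spreadsheet disagree, and the word table has no row at all. What to do about it is left to the people who own the data.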


concept – workspaces

[figure: a ‘workspace’ connecting a department / name table across three sources]

• central institutional database
• spreadsheet on colleague’s PC
• table in word doc on your own PC

fast forward ten years ...

• semantic web and RDF
  – open schema (but can be specified)
  – open world model
  – flexible and extensible (e.g. Volkswagen)
• individual data sets
  – ontology engineering
  – getting the model right
• linking open data
  – connecting web of data
  – shared vocabularies and URIs


linking open data

linking through:

• shared
• dereferenceable
• URIs
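To illustrate why shared URIs matter, the sketch below holds two small data sets as (subject, predicate, object) triples. Because both use the same URI for the same thing, their statements can be combined without any agreed central schema. All URIs are hypothetical `example.org` placeholders, and nothing is actually dereferenced over HTTP here.

```python
from collections import defaultdict

# Hypothetical shared URI used by two independent small data sets.
HALL = "http://example.org/id/village-hall"

bus_timetable = {
    (HALL, "http://example.org/vocab/nearestStop", "Main Street"),
}
squash_league = {
    (HALL, "http://example.org/vocab/hostsClub", "squash club"),
}


def describe(uri, *datasets):
    """Collect every (predicate, object) stated about uri in any data set."""
    facts = defaultdict(set)
    for dataset in datasets:
        for subject, predicate, obj in dataset:
            if subject == uri:
                facts[predicate].add(obj)
    return dict(facts)


merged = describe(HALL, bus_timetable, squash_league)
for predicate, objects in sorted(merged.items()):
    print(predicate, sorted(objects))
```

If the two sources had each invented their own identifier for the hall, the merge would silently find nothing in common, which is why the slide stresses that the URIs are shared.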


fast forward ten years ...

• semantic web and RDF
  – open schema (but can be specified)
  – open world model
  – flexible and extensible (e.g. Volkswagen)
• individual data sets
  – ontology engineering
  – getting the model right
• linking open data
  – connecting web of data
  – shared vocabularies and URIs

sounds familiar?

the long tail

[graph: size of data set]

• a few very large data sets, e.g. Open Govt., OS, geonames, dbpedia
• the small data of ordinary life: from local bus timetables to squash club league tables


supporting small data?

• Google Fusion Tables
• Google Refine
• Freebase
• Talis Kasabi

• mostly for ‘middle’ sized data

really small?

personal, but also Govt.; often tables

describe semantics rather than ‘converting’

• explicit – simple description
• implicit – semantics through interaction


explicit – 3 levels

• the table as it is:
  – there are 5 columns; col 1 is called ‘name’, col 2 is ‘population’
• internal semantics of table (in its dataset)
  – each row is the properties of a country entity defined by the ‘name’ column
• external linkage to standard data/vocab
  – rules + exceptions
  – country is ‘sameAs’ geoname country by matching name, except ‘Wales’ is geoname administrative region ...
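The three levels could be written down as a small description that sits alongside the table and leaves the table itself unconverted. The sketch below is one possible shape, not a defined format: the dictionary keys and the `link_target` helper are hypothetical, only the two column names given on the slide are filled in, and no real geonames lookup is performed.

```python
description = {
    # Level 1: the table as it is.
    "column_count": 5,
    "column_names": {1: "name", 2: "population"},
    # Level 2: internal semantics of the table.
    "row_entity": "country",
    "key_column": "name",
    # Level 3: external linkage, as a rule plus exceptions.
    "link_rule": "geoname country",
    "exceptions": {"Wales": "geoname administrative region"},
}


def link_target(row_name, description):
    """Return (kind of external entity, name to match) for one row.

    The general rule applies unless the row is listed as an exception.
    """
    kind = description["exceptions"].get(row_name, description["link_rule"])
    return (kind, row_name)


print(link_target("France", description))
print(link_target("Wales", description))
```

Keeping the exceptions next to the rule means the description stays short: one line covers almost every row, and the awkward cases are listed by name.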

implicit

action is specification:

• view a table and give it a name
• link items/columns from different data sources
• perform calculation

semantics are emergent through use


so ...

long history of consistency ... but not always possible or desirable

do not enforce consistency but highlight inconsistency

exploit the long tail of small data

plus ...

come to Tiree Tech Wave 3-7 Nov 2011