Big data Ch1,2

Jun 02, 2018

  • 8/10/2019 Big data Ch1,2

    1/20

    Big Data

    A Revolution That Will Transform How We Live, Work, and Think

    Key notes on Chapters 1-2

    Changchun Zhang

    Now (Chapter 1)

    Examples:

    Google took the 50 million most common search terms to identify areas infected by the flu virus.

    Oren Etzioni predicts whether the price of a plane ticket will rise or fall in the future, to help customers decide when to buy.

    Data has become a raw material of business, used to create a new form of economic value.

    Letting the data speak

    There is no rigorous definition of big data.

    One way to think about the issue: big data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value, in ways that change markets, organizations, the relationships between citizens and governments, and more.

    Things are speeding up. The amount of stored information grows four times faster than the world economy, while the processing power of computers grows nine times faster.

    What does this increase mean?

    By changing the amount, we change the essence. When we increase the scale of the data, we can do new things that weren't possible with smaller amounts.

    Big data is about predictions.

    Big data is not about trying to teach a computer to think like humans.

    Big data is about applying math to huge quantities of data in order to infer probabilities.

    Big data will change fundamental aspects of life by giving it a quantitative dimension it never had before.

    More, messy, good enough

    Big data's ascendancy represents three shifts in the way we analyze information that transform how we understand and organize society:

    Analyze far more data.

    Loosen up our desire for exactitude.

    Move away from the age-old search for causality.

    In this new world we can analyze far more data.

    Sampling was an artificial fetter from the time before high-performance digital technologies became prevalent.

    Using all the data lets us see details we never could when we were limited to smaller quantities.

    Loosen up the desire for exactitude:

    With big data, we'll often be satisfied with a sense of general direction rather than knowing a phenomenon down to the inch.

    What we lose in accuracy at the micro level we gain in insight at the macro level.

    A move away from the age-old search for causality.

    In a big-data world, we won't have to be fixated on causality. Instead, we can discover patterns and correlations in the data that offer us novel and invaluable insights.

    The correlations may not tell us why something is happening, but they alert us that it is happening.
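To make "correlation" concrete, here is a minimal sketch of the Pearson correlation coefficient, one common way to measure how closely two series move together. The function name `pearson` and the two series in the usage note are illustrative assumptions, not data from the book or the slides.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up weekly counts, purely for illustration.
    searches = [120, 150, 180, 260, 310]
    clinic_visits = [30, 34, 41, 58, 70]
    print(round(pearson(searches, clinic_visits), 3))
```

A value near 1 or -1 says the two series rise and fall together (or in opposition); it says nothing about why, which is the point the slide makes.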

    Big data changes the nature of business, markets, and society.

    Value shifted from physical infrastructure to intangibles and is now expanding to data, which is becoming a significant corporate asset, a vital economic input, and the foundation of new business models.

    The effect on individuals may be the biggest shock of all.

    Subject-matter specialists will have to contend with what the big-data analysis says.

    Big data will force an adjustment to traditional ideas of management, decision-making, human resources, and education.

    Due to data's vast size, decisions may often be made not by humans but by machines.

    The dark side of big data must also be considered: privacy, individual volition, and safeguarding the sanctity of the individual.

    New principles are needed for the age of big data.

    Summary:

    Harnessing vast quantities of data rather than a small portion, and privileging more data of less exactitude, opens the door to new ways of understanding.

    Big data also overturns the ideal of identifying causal mechanisms, which is self-congratulatory.

    More (Chapter 2)

    Using all the data at hand instead of just a portion of it.

    Statisticians have shown that sampling precision improves most dramatically with randomness, not with increased sample size.

    Random sampling has been a huge success and is the backbone of modern measurement at scale.
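The claim about randomness versus sample size can be checked with a small simulation. The sketch below is an illustration under invented assumptions: a hypothetical population of 200,000 people in two groups ("A" and "B") with different support rates, a random sample of 1,000, and a much larger convenience sample of 20,000 drawn only from group A. None of these numbers come from the book.

```python
import random


def make_population(size, rng):
    """Build a hypothetical population of (group, supports) records."""
    population = []
    for _ in range(size):
        # Assumed mix: about 30% in group A, 70% in group B.
        group = "A" if rng.random() < 0.3 else "B"
        # Assumed support rates: 70% in group A, 40% in group B.
        rate = 0.7 if group == "A" else 0.4
        population.append((group, rng.random() < rate))
    return population


def support_share(records):
    """Fraction of records whose 'supports' flag is True."""
    return sum(1 for _, supports in records if supports) / len(records)


def compare_samples(seed=0):
    """Compare a small random sample with a large non-random one."""
    rng = random.Random(seed)
    population = make_population(200_000, rng)
    truth = support_share(population)

    # Small, but every member has an equal chance of selection.
    random_sample = rng.sample(population, 1_000)

    # Twenty times larger, but drawn only from group A (not random).
    group_a = [record for record in population if record[0] == "A"]
    convenience_sample = rng.sample(group_a, 20_000)

    return {
        "truth": truth,
        "random_error": abs(support_share(random_sample) - truth),
        "convenience_error": abs(support_share(convenience_sample) - truth),
    }


if __name__ == "__main__":
    for name, value in compare_samples().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the small random sample lands much closer to the true share than the large convenience sample, because making the biased sample bigger does not remove its bias.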

    From some to all

    The concept of sampling no longer makes as much sense when we can harness large amounts of data.

    Sampling loses detail. In many cases, a shift is taking place from collecting some data to gathering as much as possible and, if feasible, getting everything: N = all.

    Random sampling doesn't scale easily to include subcategories, as breaking the results down into smaller and smaller subgroups increases the possibility of erroneous predictions.

    Sampling also requires careful planning and execution.

    Big data is not necessarily big in absolute terms. What classifies data as big data is that, instead of using the shortcut of a random sample, as much of the entire dataset as feasible is used.
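The subgroup problem can be illustrated with the textbook margin-of-error formula for a proportion from a simple random sample, z * sqrt(p(1 - p) / n). The sketch below assumes a 95% confidence level (z = 1.96) and the worst case p = 0.5; the function name and the subgroup sizes are chosen only for illustration.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


if __name__ == "__main__":
    # A full sample of 1,000 and progressively smaller subgroups of it.
    for n in (1_000, 250, 100, 25):
        print(f"n = {n:>5}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

Because the margin grows with 1 / sqrt(n), cutting a sample into ever smaller subgroups quickly makes each subgroup estimate too imprecise to trust.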

    Using all the data instead of a sample isn't always necessary. But in an increasing number of cases using all the data at hand does make sense, and doing so is feasible now where before it was not.

    Sampling will no longer be the predominant way we analyze large data sets. We will aim to go for all the data.