Learning from past infrastructure to embrace friction and create the Research Data Alliance

Jul 17, 2015


  • Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License

    Learning from past infrastructure to embrace friction and create the Research Data Alliance

    Mark A. Parsons

    Secretary General

    American Geophysical Union

    San Francisco, CA

    16 December 2014

  • Friction is inevitable and necessary in collaboration

    "A wheel turns because of its encounter with the surface of the road; spinning in the air it goes nowhere."

    Cover notes for Friction: An Ethnography of Global Connection by Anna Lowenhaupt Tsing

    Marcel Duchamp's Bicycle Wheel (photo)

  • Friction: An ethnography of global connection

    Actual existing universalisms are hybrid, transient, and involved in constant reformulation through dialogue.

    They work out through friction.

  • Coalition Politics: There is no reason to think collaborators have common goals.

  • Dynamics of Infrastructure (Edwards et al. 2007, Understanding Infrastructure: Dynamics, Tensions, and Design)

    Infrastructures become ubiquitous, accessible, reliable, and transparent as they mature.

    Systems → Networks → Inter-networks

    First, system-building, characterized by the deliberate and successful design of technology-based services.

    Next, technology transfer across domains and locations results in variations on the original design, as well as the emergence of competing systems.

    Finally, a process of consolidation characterized by gateways that allow dissimilar systems to be linked into networks.

  • Research Data Alliance

    Vision: Researchers and innovators openly share data across technologies, disciplines, and countries to address the grand challenges of society.

    Mission: RDA builds the social and technical bridges that enable open sharing of data.

  • The Evolution of Data Citation: Then

    Back in the day, data were embedded in the literature as tables, maps, monographs, etc., and we cited accordingly.

    Then digital data becomes the norm. It's messier and we forget how to routinely cite.

    Initial efforts to define digital data citation in the late '90s and early '00s. Right idea, little traction (or friction). Partially conflated with the "citing URLs" issue.

    A blossoming in the mid-to-late '00s. Multiple disciplines start developing approaches and guidelines. DOI a big driver, especially for publishers and DataCite, but other identifiers used too (Handles, LSIDs, UNFs, ARKs, and good ol' URI/Ls). A somewhat competitive atmosphere: more friction.

  • The Evolution of Data Citation: Now

    Now a consensus phase.

    Out of Cite, Out of Mind: The Current State of Practice, Policy, and Technology for the Citation of Data. 2013.

    Global Joint Declaration of Data Citation Principles. 2014.

  • The Evolution of Data Citation: Next

    Implementation phase just begun.

    ESIP Guidelines adopted by a variety of NASA and NOAA data centers, AGU, and GEOSS.

    AGU Publishing Committee has author guidelines based on ESIP.

    Data centers are building relationships with publishers.

    Several data centers partnering with publishers, e.g. Elsevier's "article of the future."

    Joint implementation team for the Principles.

    It happens locally and requires culture change, so debates and friction will continue.

  • A final point from Tsing

    Unity and diversity cover each other up. Need to remember the local.

    This means we must act "glocally" to succeed.

    "Glocalization means the simultaneity, the co-presence, of both universalizing and particularizing tendencies." Roland Robertson

  • Friction is also Where Good Ideas Come From

    The Adjacent Possible: the importance of local

    Often not "Eureka!" but rather a slow hunch fading into view over time.

    Hunches need to collide with other hunches, so create that environment. Don't protect IP; share it. Connecting vs. protecting.

    Sharing of failures as well. Create spaces for that to happen: virtual and real coffee shops.

    Chance favors the connected mind.

  • What does this mean for RDA?

    1. RDA focusses on developing gateways

    2. RDA doesn't do architecture, but it does provide a level of unity.

  • Deliverables that make data work

    Create - Adopt - Use

    Adopted code, policy, specifications, standards, or practices that enable data sharing

    Harvestable efforts for which 12-18 months of work can eliminate a roadblock

    Efforts that have substantive applicability to groups within the data community but may not apply to all

    Efforts that can start today

    RDA Principles

    Openness




    Community Driven


  • RDA Working Groups

    1. Brokering Governance

    2. Data Citation WG

    3. Data Description Registry

    4. Data Foundation and Terminology

    5. Data Type Registries WG

    6. Metadata Standards Directory Working Group

    7. PID Information Types WG

    8. Practical Policy WG

    9. RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World*

    10. RDA/WDS Publishing Data Bibliometrics WG

    11. RDA/WDS Publishing Data Services WG

    12. RDA/WDS Publishing Data Workflows WG

    13. Repository Audit and Certification DSA-WDS Partnership WG

    14. Standardisation of Data Categories and Codes WG

    15. The BioSharing Registry: connecting data policies, standards & databases in life sciences*

    16. Wheat Data Interoperability WG

    * in review

  • Initial Products: adopt one today!

    A basic vocabulary of foundational terminology and query tool to make sure we know what we're talking about.

    A data type model and registry (MIME-types for data) to help tools interpret, display, and process data.

    A persistent identifier type registry to help search engines understand what they are pointing to and retrieving.

    Coming soon:

    A basic set of machine-actionable rules to enhance trust.

    A metadata standards directory so we can describe similar things consistently.

    A dynamic-data citation methodology so we can reference precise subsets of changing data.

    Semantically linked terms describing wheat data so we can share harvest and related information around the world.

    A unified repository certification scheme to reduce confusion and improve trust.

  • What does this mean for RDA?

    1. RDA focusses on developing gateways

    2. RDA doesn't do architecture, but it does provide a level of unity.

    3. RDA plays both globally and locally. Think "glocal."

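  • Distribution of 2,538 Individual RDA Members in 92 Countries, 3 December 2014

    [Figure: map and charts of membership by sector and region. Surviving labels: 17% Academia; North America (percentage lost); Austral-Pacific 5%; Africa 3%; South America 1%; Asia 5%. Map credit truncated in source.]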

  • Regional RDAs

    Australian National Data Service, RDA/United States, RDA/Europe,

    Implement RDA deliverables locally and enhance adoption.

    Ensure regional or national issues are addressed globally.

    Support plenaries and support attendance at plenaries.

  • What does this mean for RDA?

    1. RDA focusses on developing gateways

    2. RDA doesn't do architecture, but it does provide a level of unity.

    3. RDA plays both globally and locally. Think "glocal."

    4. RDA fosters relationships, interfaces, and connections.

    5. RDA provides a neutral place to identify and work through friction.

  • RDA Interest Groups

    1. Active Data Management Plans IG*

    2. Agricultural Data Interoperability IG

    3. Big Data Analytics IG

    4. Biodiversity Data Integration IG

    5. Brokering IG

    6. Community Capability Model IG

    7. Data Fabric IG

    8. Data for Development

    9. Data in Context IG

    10. Development of cloud computing capacity and education in developing world research

    11. Digital Practices in History and Ethnography IG

    12. Domain Repositories Interest Group

    13. Education and Training on handling of research

    14. ELIXIR Bridging Force IG

    15. Engagement IG

    16. Federated Identity Management

    17. Geospatial IG*

    18. Libraries for Research Data*

    19. Long tail of research data IG

    20. Marine Data Harmonization IG

    22. Metadata IG

    23. PID Interest Group

    24. Preservation e-Infrastructure IG

    25. Quality of Urban Life IG

    26. RDA/CODATA Legal Interoperability IG

    27. RDA/CODATA Materials Data, Infrastructure & Interoperability IG

    28. RDA/WDS Certification of Digital Repositories IG

    29. RDA/WDS Publishing Data Cost Recovery for Data Centres

    30. RDA/WDS Publishing Data IG

    31. Reproducibility IG*

    32. Research data needs of the Photon and Neutron Science community

    33. Research Data Provenance

    34. Service Management IG

    35. Structural Biology IG

    36. Toxicogenomics Interoperability IG

    * in review

  • Plenary 5: San Diego, California, 9-11 March 2015

    Photo © 2013 Pecoff Studios Inc

  • Get involved!

    Join RDA as an individual member supporting our principles at

    Join as an Organisational Member (nominal fee) or an Organisational Affiliate (jointly sponsored efforts).

    Initiate or join an