Federated identity and scholarly identity - a match made in heaven?
Gudmundur A. Thorisson, PhD <[email protected]>
Research associate, University of Leicester
Guest scientist, University of Iceland
Participant in the GEN2PHEN Consortium and the ORCID Technical Working Group
This work is published under the Creative Commons Attribution license (CC BY: http://creativecommons.org/licenses/by/3.0/), which means that it can be freely copied, redistributed and adapted, as long as proper attribution is given.
TNC2012 TERENA Networking Conference, Reykjavik, May 21-24 2012
Biological research too is increasingly “Big” and data-driven
‣ From: small-scale datasets that fit into a printed journal article
Richards, M. et al. Paleolithic and neolithic lineages in the European mitochondrial gene pool. American journal of human genetics 59, 185-203 (1996). http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1915109/
‣ To: large-scale collection of biological data in digital form
‣ Huge technological advances in the last 5-10 years
- experimental / observations <-- gathering data with high-throughput equipment
- computer technology <-- storing & analyzing massive data volumes
‣ Example: massively-parallel sequencing
- Determine human genome sequence in <1 day - the $1000 genome
- Metagenomics: sequence *everything* in environment samples
- Large bio-specimen collections: 100,000s of individuals in disease/population biobanks
Prof Anthony J Brookes GEN2PHEN coordinator
Chair, Bioinformatics and Genomics
Department of Genetics
Identifying contributors
‣ Why? So we can..
- Attribution - link content creators with their works and attribute credit appropriately
- Discovery - who contributed to publication X? which publications has person/organization Y contributed to?
‣ What kind of contributions?
- Characterizing ‘contributorship’:
  role: author, creator, analyst, reviewer
  contribution: ‘conceived of study & designed experiment’, ‘wrote paper’, ‘performed experiments’
‣ LHC example: ~2000 ‘authors’ and ~170 institutions
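The attribution and discovery questions above can be sketched as a small data model. This is only an illustration in Python, not an ORCID schema: the `Contribution` and `ContributionIndex` classes, their field names, and identifiers such as `person:1` and `work:A` are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    """One person's contribution to one work (hypothetical record layout)."""
    person_id: str    # unique person-level identifier (made-up values below)
    work_id: str      # identifier of the publication, dataset, etc.
    role: str         # e.g. "author", "creator", "analyst", "reviewer"
    description: str  # e.g. "wrote paper", "performed experiments"


class ContributionIndex:
    """Index contributions both ways to answer the two discovery questions."""

    def __init__(self):
        self._by_work = defaultdict(list)
        self._by_person = defaultdict(list)

    def add(self, contribution):
        self._by_work[contribution.work_id].append(contribution)
        self._by_person[contribution.person_id].append(contribution)

    def contributors_to(self, work_id):
        """Who contributed to this work, and in what role?"""
        return [(c.person_id, c.role) for c in self._by_work.get(work_id, [])]

    def works_of(self, person_id):
        """Which works has this person contributed to?"""
        return sorted({c.work_id for c in self._by_person.get(person_id, [])})


index = ContributionIndex()
index.add(Contribution("person:1", "work:A", "author", "wrote paper"))
index.add(Contribution("person:2", "work:A", "analyst", "performed experiments"))
index.add(Contribution("person:1", "work:B", "reviewer", "reviewed manuscript"))
```

Here `index.contributors_to("work:A")` answers "who contributed to publication X?" and `index.works_of("person:1")` answers "which publications has person Y contributed to?". Both lookups depend on `person_id` being unique per person.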
Problem #1: name ambiguity
Are these authors all the same person?
G. Thorisson, University of Leicester
G. A. Thorisson, University of Leicester
G. A. Thorisson, Cold Spring Harbor Laboratory
How about these?
Or these?
J. Smith, J. Smith, J. Smith, J. Smith, J. Smith [...]
[..] ∼2/3 of the ∼6 million authors in MEDLINE share a last name and first initial with at least one other author, and an ambiguous name refers to ∼8 persons on average.
Torvik and Smalheiser. Author name disambiguation in MEDLINE. ACM Transactions on Knowledge Discovery from Data (2009) vol. 3 (3)
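The "last name plus first initial" collision described in the quote can be illustrated with a short Python sketch. The functions `name_key` and `ambiguous_groups` are hypothetical helpers written for this example, and the J. Smith and A. Brookes affiliations are placeholders, not real data.

```python
from collections import defaultdict


def name_key(full_name):
    """Reduce 'G. A. Thorisson' or 'Gudmundur Thorisson' to ('thorisson', 'g')."""
    parts = full_name.replace(".", " ").split()
    return parts[-1].lower(), parts[0][0].lower()


def ambiguous_groups(records):
    """Group (name, affiliation) records that share last name + first initial."""
    groups = defaultdict(list)
    for name, affiliation in records:
        groups[name_key(name)].append((name, affiliation))
    # Only keys with more than one record are ambiguous.
    return {key: recs for key, recs in groups.items() if len(recs) > 1}


records = [
    ("G. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "Cold Spring Harbor Laboratory"),
    ("J. Smith", "Institution X"),    # placeholder affiliations, not real data
    ("J. Smith", "Institution Y"),
    ("A. Brookes", "Institution Z"),  # unique key, so not reported
]
groups = ambiguous_groups(records)
```

The grouping only flags candidates. It cannot tell whether the three Thorisson records are one person or whether the two J. Smith records are different people, which is the gap a unique person-level identifier is meant to close.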
Problem #1: name ambiguity
‣ Number of authors and other scholarly contributors is increasing
‣ Number & kinds of “works” they contribute to is increasing
‣ The scholarly record is broken
‣ Reliable attribution of authors and contributors is impossible without unique person-level identifiers
Problem #2: digital identity crisis
‣ Session title: Scientific Schizophrenia - How many identities do YOU have?
‣ Well, I have several! <-- identity crisis??
- 2x Universities I’m affiliated with
- Several scholarly/professional profile services
- LinkedIn professional profile / CV
- Twitter microblogging (for professional purposes)
- Several other author profiles that are not under my control (Web of Science, Scopus, others)
‣ Identity fragmentation - big, big mess!!!!
How to Make a Tackle in Rugby
Tackling in rugby is one of the most important aspects of the game. [...]
The Open Researcher & Contributor ID initiative
‣ ORCID is an international, interdisciplinary organization involving multiple stakeholders:
- Research institutions, libraries, funding organizations, publishers, intermediaries and individual researchers
‣ Started in late 2009 to solve the name ambiguity problem in scholarly communication.
‣ Incorporated as a non-profit with a Board of Directors in August 2010.
The Open Researcher & Contributor ID initiative
ORCID will work to support the creation of a permanent, clear and unambiguous record of scholarly communication by enabling reliable attribution of authors and contributors through unique identifiers
ORCID Participants
ORCID has 328 participant organizations from across the world, 50 of which have provided sponsorship funding.
Given a work, tell me who is responsible for it and describe the nature of that responsibility.
Disambiguation without de-duplication - Modeling authority and trust in the ORCID system http://about.orcid.org/sites/default/files/disambiguation-deduplication_wp_v4.pdf
‣ Recommendations / suggested actions from report
[...]
Opportunities for collaboration and interoperability
Service providers should investigate possibilities for authenticating ‘homeless’ users (i.e. freelance researchers with no affiliation, or affiliated researchers at institutions which aren't part of an IDF) via ORCID or other trusted source of author identifiers that may join IDFs in the future.
The IDF community and ORCID should work to harmonize core profile fields/attributes which are likely to hold institution-validated information.
Establish a pilot on federated access management to a biomedical data provider together with EGA, eduGAIN and related national IDFs.
Investigate how an ORCID or other author identifier and its provenance can be modelled as an attribute in IDF and interfederation services, as part of a set of attributes automatically released by the identity provider.
[...]
IRISC2011 workshop @CSC, Helsinki
Match made in heaven, no? - Opportunities for collaboration -
‣ Pilot ORCID <-> IDF integration in high-value use cases
‣ Starting points - some suggestions
- A) Authenticate via federated identity to central ORCID system
- User authenticates the first time and registers; the new profile is populated on the fly with organization-validated information released by the IdP
- B) Starting from institution, link ORCID account with inst. user account and pull in ORCID identifier + publication data
- Would need IDF attribute to carry universal, validated author identifier
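Starting point A, combined with the attribute requirement in B, can be sketched as follows. This is a minimal illustration under stated assumptions, not an ORCID or federation API: `build_profile` and the profile layout are hypothetical, `authorIdentifier` is a made-up attribute name standing in for the "universal, validated author identifier", and the other attribute names are only assumed to be among those an IdP releases.

```python
# Attributes treated as institution-validated in this sketch (an assumption).
VALIDATED_ATTRIBUTES = {
    "displayName": "name",
    "mail": "email",
    "eduPersonScopedAffiliation": "affiliation",
    "authorIdentifier": "author_id",  # hypothetical attribute, see option B
}


def build_profile(released, idp_entity_id):
    """Populate a new profile from attributes released by the IdP.

    Each field records its provenance so that institution-validated values
    can later be told apart from self-asserted ones.
    """
    profile = {}
    for attribute, field in VALIDATED_ATTRIBUTES.items():
        values = released.get(attribute)
        if values:  # attribute release is optional; skip missing/empty ones
            profile[field] = {"value": values[0], "source": idp_entity_id}
    return profile


# Made-up example of an attribute release (attributes are multi-valued lists).
released = {
    "displayName": ["Jane Researcher"],
    "mail": ["jane@example.org"],
    "eduPersonScopedAffiliation": ["staff@example.org"],
}
profile = build_profile(released, "https://idp.example.org/idp")
```

Because this example IdP does not release `authorIdentifier`, the resulting profile has no `author_id` field. Keeping the `source` next to each value is one way to carry the provenance that the recommendations call for.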
Where do we go from here?
‣ Get involved - join the discussion
- http://about.orcid.org - Main website, general info
- http://dev.orcid.org - Developer web portal - NEW!!
- Test “sandbox” system (bring your own sand!): http://devsandbox.orcid.org and http://api.devsandbox.orcid.org
- Contact me, as (provisional) co-chair of ORCID’s Technical Outreach Working Group, together with Elsevier’s Mike Taylor
Prof Anthony J. Brookes Bioinformatics Group, Leicester
This work has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.