Mendeley's Data and Perspectives on Data Challenges Kris Jack, PhD Chief Data Scientist https://twitter.com/_krisjack

Mendeley's Data and Perspectives on Data Challenges

May 28, 2015



Kris Jack

Presentation given at the RecSysChallenge workshop (http://2012.recsyschallenge.com/) at Recommender Systems 2012 (http://recsys.acm.org/2012/).
Transcript
Page 1: Mendeley's Data and Perspectives on Data Challenges

Mendeley's Data and Perspectives on Data

Challenges

Kris Jack, PhD, Chief Data Scientist

https://twitter.com/_krisjack

Page 2: Mendeley's Data and Perspectives on Data Challenges

Overview

➔ What's Mendeley?

➔ Why Run Challenges?

➔ Mendeley's Challenges

➔ Conclusions

Page 3: Mendeley's Data and Perspectives on Data Challenges

What's Mendeley?

Page 4: Mendeley's Data and Perspectives on Data Challenges

➔ Mendeley is a platform that connects researchers, research data and apps

➔ How are we building our community?

Mendeley Open API

Page 5: Mendeley's Data and Perspectives on Data Challenges

Mendeley provides tools to help users...

...organise their research

➔ Reference management

➔ Cite-as-you-write

➔ Full-text article search

➔ Digitalised annotations

Page 6: Mendeley's Data and Perspectives on Data Challenges

Mendeley provides tools to help users...

...organise their research

...collaborate with one another

➔ Professional research groups

➔ Social network

➔ Annotation sharing

Page 7: Mendeley's Data and Perspectives on Data Challenges

Mendeley provides tools to help users...

...organise their research

...collaborate with one another

...discover new research

➔ Personalised article recommendations

➔ Related research

➔ Research contact suggestions

Page 8: Mendeley's Data and Perspectives on Data Challenges

Our community from a data perspective

Social network (~2M users)

Research catalogue (~50M unique articles)

Research groups (~175K groups)

Personal libraries (~300M articles)

Page 9: Mendeley's Data and Perspectives on Data Challenges

Why Run Challenges?

Page 13: Mendeley's Data and Perspectives on Data Challenges

Why Run Challenges?

➔ An important part of our mission is to make science more open

“All the time we are very conscious of the huge challenges that human society has now – curing cancer, understanding the brain for Alzheimer’s [...].

But a lot of the state of knowledge of the human race is sitting in the scientists’ computers, and is currently not shared [...] We need to get it unlocked so we can tackle those huge problems.”

➔ We run challenges that aim to open up science

➔ Your skills in information sciences are valuable to us

Page 14: Mendeley's Data and Perspectives on Data Challenges

Mendeley's Challenges

Page 15: Mendeley's Data and Perspectives on Data Challenges

PLoS/Mendeley's Binary Battle

Challenge: Build an application with our data, make science more open.

Results:

More details at http://dev.mendeley.com/api-binary-battle/

Page 16: Mendeley's Data and Perspectives on Data Challenges

ScienceRec Challenge 2012

Challenge: Build an off-line system for scientific recommendations with our API and the DataTEL data set

50K users, with at least 20 articles each

Results: Will discuss today. How to improve for the future?

More details at http://2012.recsyschallenge.com/tracks/sciencerec/
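As a rough illustration of what an off-line evaluation over a data set of this shape might involve, the sketch below filters users by the 20-article threshold quoted on the slide, holds out a few articles per user, and scores a simple popularity baseline with precision@k. It is not the challenge's actual protocol: the function names, the hold-out size, the baseline, and the in-memory `libraries` dictionary are all assumptions made for illustration.

```python
from collections import Counter

MIN_ARTICLES = 20  # threshold quoted on the slide


def filter_users(libraries, min_articles=MIN_ARTICLES):
    """Keep users whose library holds at least `min_articles` distinct articles."""
    return {u: arts for u, arts in libraries.items()
            if len(set(arts)) >= min_articles}


def split_library(articles, n_test=5):
    """Hold out the last `n_test` articles of a library (hypothetical split)."""
    return articles[:-n_test], articles[-n_test:]


def popularity_recommender(train, k=10):
    """Hypothetical baseline: the k most common articles the user does not own."""
    counts = Counter(a for arts in train.values() for a in arts)
    ranked = [a for a, _ in counts.most_common()]

    def recommend(user):
        owned = set(train[user])
        return [a for a in ranked if a not in owned][:k]

    return recommend


def precision_at_k(recommended, held_out):
    """Fraction of recommended articles that appear in the held-out set."""
    if not recommended:
        return 0.0
    return len(set(recommended) & set(held_out)) / len(recommended)


def evaluate(libraries, k=10, n_test=5, min_articles=MIN_ARTICLES):
    """Mean precision@k of the popularity baseline over the filtered users."""
    kept = filter_users(libraries, min_articles)
    splits = {u: split_library(arts, n_test) for u, arts in kept.items()}
    train = {u: tr for u, (tr, _) in splits.items()}
    recommend = popularity_recommender(train, k)
    scores = [precision_at_k(recommend(u), te) for u, (_, te) in splits.items()]
    return sum(scores) / len(scores) if scores else 0.0
```

Any real recommender could be swapped in for `popularity_recommender`; the point of the sketch is only the filter, split, and score loop.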

Page 17: Mendeley's Data and Perspectives on Data Challenges

Conclusions

Page 18: Mendeley's Data and Perspectives on Data Challenges

Conclusions

➔ Mendeley makes tools that help researchers to:

  ➔ organise their research

  ➔ collaborate with one another

  ➔ discover new research

➔ We are crowdsourcing a wealth of research data

➔ We're opening it up to the world

➔ And inviting you to participate

Page 19: Mendeley's Data and Perspectives on Data Challenges

We're Hiring!

➔ Data Scientist

➔ apply recommender technologies to Mendeley's data

➔ work on improving the quality of Mendeley's research catalogue

➔ starting in first quarter of 2013

➔ 6-month secondment at KNOW Center, TU Graz, Austria as part of the EC FP7 TEAM project (http://team-project.tugraz.at/)

➔ http://www.mendeley.com/careers/

Page 20: Mendeley's Data and Perspectives on Data Challenges

www.mendeley.com

Page 21: Mendeley's Data and Perspectives on Data Challenges

A Challenge for the Future?

Challenge: Investigate how well algorithms perform in real-world settings

Motivation: Industry repeatedly finds that aggressive A/B testing is required because offline improvements do not necessarily translate to online improvements

Problem: Academia tends not to have access to large online communities

Solution: Industry runs A/B tests with academic algorithms and reports results

What about privacy?

➔ Use publicly available data

➔ Anonymise and aggregate results reported
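One way the "anonymise and aggregate" idea could look in practice is sketched below: raw A/B events are reduced to per-arm counts with no user identifiers, small arms are suppressed, and the two arms are compared with a standard two-proportion z-test. The event format, the `min_users` threshold, and the choice of click-through rate as the metric are assumptions for illustration, not something the slides specify. The test also treats impressions as independent trials, which is a simplification.

```python
import math
from collections import defaultdict


def aggregate(events, min_users=100):
    """Reduce (user_id, arm, clicked) events to per-arm counts without user ids.

    Arms with fewer than `min_users` distinct users are dropped from the
    report (a hypothetical suppression rule to limit re-identification).
    """
    users = defaultdict(set)
    shown = defaultdict(int)
    clicks = defaultdict(int)
    for user_id, arm, clicked in events:
        users[arm].add(user_id)
        shown[arm] += 1
        clicks[arm] += int(clicked)

    report = {}
    for arm in shown:
        if len(users[arm]) < min_users:
            continue  # too few users to report safely
        report[arm] = {
            "users": len(users[arm]),
            "impressions": shown[arm],
            "clicks": clicks[arm],
            "ctr": clicks[arm] / shown[arm],
        }
    return report


def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Two-sided two-proportion z-test; returns (z, p_value)."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # both arms all-click or no-click: no evidence of a difference
    z = (p_b - p_a) / se
    return z, math.erfc(abs(z) / math.sqrt(2))
```

Only the output of `aggregate` and the test statistic would need to leave the company, so an academic partner could see how an algorithm performed online without receiving any per-user records.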
