Touring Dataland? Automated Recommendations for the Big Data Traveler

William Agnew, Georgia Institute of Technology
Kyle Chard (Advisor), University of Chicago
Ian Foster (Advisor), University of Chicago
Michael Fischer, University of Wisconsin-Milwaukee

Website: wagnew3.github.io

Introduction

Big data scientists, like travelers to a new land, are faced with the daunting task of discovering which (storage) locations contain interesting attractions (i.e., research data).

Many services, such as travel websites, provide user-specific recommendations derived from analysis of huge amounts of usage data.

We explore how recommendation approaches can be adapted and applied to big data science. In particular, we create heuristics for recommending Globus data locations.

Globus

[Figure: Globus [1] network. Each endpoint is a vertex, drawn larger if the endpoint is more popular. Edges connect endpoints that have transferred data; an edge is more visible if transfers between the pair are more frequent.]

Long-tailed Usage Distributions

[Figure panels: Bytes Transferred per User; Transfers per User; Unique Endpoints per User]

Heuristics

History: The most likely source (S) / destination (D) endpoint is the most recent S/D endpoint used by a user.

Markov Chain: A transition matrix of the observed probabilities of using each endpoint as an S/D, conditioned on a particular endpoint being previously used as an S/D.

Most Unique Users: The most likely S/D endpoint is the S/D endpoint with the most unique users.

Institution: The most likely S/D endpoint is the most popular endpoint at that user's institution.

Endpoint Ownership: The most likely S/D endpoint is the endpoint most recently created by the user.

Heuristics perform well for different classes of users.

Deep Recurrent Neural Networks

We use a deep recurrent neural network [2] to combine heuristics by ranking the predictions of each heuristic for the series of user endpoint choices.

[Figure: Neural network block. Takes as input heuristic endpoint recommendations and a memory of past recommendations to the user, and outputs reweighted heuristic endpoint recommendations and an updated recommendation memory.]

Results

Transfer accuracy: the average number of endpoints correctly predicted.

User accuracy: the average accuracy per user, where a user's accuracy is the fraction of that user's endpoints correctly predicted.

The neural network, which combines heuristics, outperforms all individual heuristics.

The most unique users, institution, and owned endpoints heuristics perform poorly except in cases where users have little or no transfer history.

Recommendation Mockup

[Figure: recommendation mockup]

References

1. Foster, Ian. "Globus Online: Accelerating and democratizing science through cloud-based services." IEEE Internet Computing 15.3 (2011): 70.
2. Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013).

Acknowledgements

This work is supported in part by the National Science Foundation grant NSF-1461260 (BigDataX REU).
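
Illustrative sketch

The History and Markov Chain heuristics, and the two accuracy measures defined in Results, can be illustrated with a short Python sketch. This is not the authors' implementation. The class names, the `evaluate` helper, and the toy log are hypothetical. The sketch makes several simplifying assumptions: each user has a single stream of endpoint choices (rather than separate source and destination streams), the Markov chain is first-order and updated online as choices are observed, and transfer accuracy is interpreted as the fraction of all endpoint choices predicted correctly, pooled across users.

```python
from collections import Counter, defaultdict


class HistoryHeuristic:
    """History: recommend the endpoint the user used most recently."""

    def predict(self, previous):
        return previous

    def update(self, previous, current):
        pass  # nothing to learn; the prediction is always the previous endpoint


class MarkovChainHeuristic:
    """Markov Chain: recommend the endpoint most often observed after `previous`."""

    def __init__(self):
        # counts[previous][current] = times `current` followed `previous`
        self.counts = defaultdict(Counter)

    def predict(self, previous):
        followers = self.counts.get(previous)
        if not followers:
            return None  # no transition observed yet from this endpoint
        return followers.most_common(1)[0][0]

    def update(self, previous, current):
        self.counts[previous][current] += 1

    def probability(self, previous, current):
        """Observed probability of `current` given `previous` (one matrix entry)."""
        followers = self.counts.get(previous)
        if not followers:
            return 0.0
        return followers[current] / sum(followers.values())


def evaluate(logs, heuristic):
    """Return (transfer_accuracy, user_accuracy) for next-endpoint prediction.

    Assumption: transfer accuracy pools every choice across all users, while
    user accuracy averages each user's own fraction of correct predictions.
    """
    total_hits = 0
    total_choices = 0
    per_user = []
    for endpoints in logs.values():
        hits = 0
        for previous, current in zip(endpoints, endpoints[1:]):
            if heuristic.predict(previous) == current:
                hits += 1
            heuristic.update(previous, current)  # learn only after predicting
        choices = len(endpoints) - 1
        if choices > 0:
            total_hits += hits
            total_choices += choices
            per_user.append(hits / choices)
    if total_choices == 0:
        return 0.0, 0.0
    return total_hits / total_choices, sum(per_user) / len(per_user)


# Hypothetical toy log: user -> ordered list of endpoints used.
logs = {
    "alice": ["A", "B", "A", "B", "A"],
    "bob": ["C", "C", "C"],
}

history_scores = evaluate(logs, HistoryHeuristic())
markov = MarkovChainHeuristic()
markov_scores = evaluate(logs, markov)
print("history:", history_scores)
print("markov: ", markov_scores)
```

On this toy log the two measures differ for the History heuristic because the user with the longer, alternating history dominates the pooled count, while each user counts equally in the per-user average. A long-tailed usage distribution makes that kind of gap between the two measures more likely.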