Jan 04, 2016
What’s the Big Deal About R?
Tom Tiedeman, OCIO
July 21, 2015
Typical Patent / Trademark Questions
• Is my idea actually new?
• How much innovation comes from our state university? Has state support paid off?
• How can I easily track new patent grants and applications in my interest area?
• How can actual use of the newest technology be increased?
Making sense of big data
• Diverse user interests
• Interest in particular needle not the haystack
• Inference / judgment are key
• Continued monitoring for new developments
• Possible huge economic impacts – or not :o)
• Total volume of complex questions could be extreme
• Several data sets needed for an answer
USPTO’s Challenge: What Good is Open Data if People Can’t Use It?
• Terabytes of data
• Fast-changing (~ 30 – 50 GB per week)
• Complicated data structure (XML / relational)
• Fuzzy information (images, non-standard text)
USPTO Data and Existing Tools
• USPTO web downloads very constrained
• Page scraping is insufficient
• XML is not just rows and columns
• Formats like PDF are non-trivial
• Data scale is much too large for tools like Excel and Access
• What USPTO provides / does will change
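The point that XML is not just rows and columns can be illustrated with a small sketch, written here in Python using only the standard library. The record layout, element names, and values are invented for illustration and do not follow any real USPTO schema. A single patent element can hold several inventors and several classifications, so forcing it into a flat table means repeating the parent fields once for every combination.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: tags, attributes, and values are made up for this
# sketch and are not taken from an actual USPTO file.
SAMPLE = """
<patents>
  <patent id="A1">
    <title>Widget</title>
    <inventors>
      <inventor>Ada</inventor>
      <inventor>Bob</inventor>
    </inventors>
    <classes>
      <class>G06F</class>
    </classes>
  </patent>
  <patent id="B2">
    <title>Gadget</title>
    <inventors>
      <inventor>Cy</inventor>
    </inventors>
    <classes>
      <class>H04L</class>
      <class>G06N</class>
    </classes>
  </patent>
</patents>
"""


def flatten(xml_text):
    """Return one tuple per (patent, inventor, class) combination."""
    rows = []
    root = ET.fromstring(xml_text.strip())
    for patent in root.findall("patent"):
        pid = patent.get("id")
        title = patent.findtext("title")
        # Each of these child lists can have any length per patent.
        inventors = [e.text for e in patent.findall("inventors/inventor")]
        classes = [e.text for e in patent.findall("classes/class")]
        # Flattening repeats the patent-level fields for every pairing.
        for inventor in inventors:
            for cls in classes:
                rows.append((pid, title, inventor, cls))
    return rows


rows = flatten(SAMPLE)
for row in rows:
    print(row)
```

Two nested records become four flat rows, with the ID and title duplicated in each. At real data volumes this duplication is one reason spreadsheet-style tools fit poorly.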
Why “R”?: Loose fit for a wide range of problems
• Statistical / graphical computing focus
• Free PC-based open source software
• Links with other languages
• Growing power, application, user base
• Online download capability
• Tools for XML, APIs, JSON, other data formats
• 6,900 packages, plus framework for more
• Many training courses, academic base
• Just Google “R”
Learn “R”
• MOOCS and courses to learn R
http://www.r-bloggers.com/moocs-and-courses-to-learn-r/
• EdX.org: Explore Statistics with R
https://www.edx.org/course/explore-statistics-r-kix-kiexplorx-0/
• Coursera.org: Data Science Specialization
https://www.coursera.org/specialization/jhudatascience/1