Search and the ‘Net @ 2015 Michael Hunter Reference Librarian Hobart and William Smith Colleges For Rochester Regional Library Council Member Libraries’ Staff Sponsored by the Rochester Regional Library Council Supported by Regional Bibliographic Databases and Resources Sharing (RBDB) funds granted by the New York State Library 2015
For today . . .
The Searchscape
Current Evolution of Search
New Services
The Social Web and Research
Data Visualization
Bing, Yahoo and DuckDuckGo
Google
Linklist
http://people.hws.edu/hunter/searchnet15links.htm
USC Annenberg’s Digital Future Report 2014 http://www.digitalcenter.org/wp-content/uploads/2014/12/2014-Digital-Future-Report.pdf "General Internet Activities"
E-Reading Rises as Device Ownership Jumps By Kathryn Zickuhr and Lee Rainie http://www.pewinternet.org/2014/01/16/e-reading-rises-as-device-ownership-jumps
Chart: American adults 18+, % who read at least 1 book in that year
Chart: American adults 18+, % who own each device
New Top Level Domains
First made available 1/29/14
Over 150 now live on donuts.co (2/15/15)
Content-significant
.bike, .energy, .delivery, .legal, .guru
Brand-specific – “vanity domains”
.android, .walmart, .nyc
Allow for non-Roman scripts: Arabic, Chinese, etc.
Require proof of identity/relationship to TLD
Applying for a unique TLD costs $185,000 (ICANN application fee)
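As a rough illustration of the categories above, the sketch below pulls the top-level domain out of a URL and labels it using the example TLDs from this slide. The category sets, the function name classify_tld, and the sample URLs are assumptions made for the example, not an official registry list. The last lines show how a hostname in a non-Roman or accented script is converted to its ASCII form with Python's built-in idna codec.

```python
from urllib.parse import urlparse

# Example TLDs taken from the slide; illustrative sets, not a complete registry.
CONTENT_TLDS = {"bike", "energy", "delivery", "legal", "guru"}
BRAND_TLDS = {"android", "walmart", "nyc"}


def classify_tld(url: str) -> tuple[str, str]:
    """Return (tld, category) for a URL, using the illustrative sets above."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    if tld in CONTENT_TLDS:
        return tld, "content-significant"
    if tld in BRAND_TLDS:
        return tld, "brand-specific"
    return tld, "other"


# Hypothetical URLs, used only to exercise the function.
for url in ["http://repairs.bike/", "http://maps.nyc/", "http://example.org/"]:
    print(url, classify_tld(url))

# Internationalized hostnames travel over DNS in an ASCII "xn--" form (Punycode).
ascii_host = "bücher.example".encode("idna")
print(ascii_host)
```

The lookup is a plain set membership test, so adding or moving a TLD between categories only means editing the two sets.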
Growth of Query Types over 1 year http://searchenginewatch.com/sew/how-to/2383498/how-will-voice-search-impact-a-search-marketers-world
Google’s Knowledge Graph – rooted in the (human) community-created entities in Freebase
Crowdsourcing too slow; often ignores specialized areas of knowledge, non-English content
Knowledge Vault – Automated extraction of raw data and creation of entities derived from that data
DOM trees: structures that help browsers represent and interact with documents in HTML and other formats (Wikipedia)
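To make the DOM tree idea concrete, here is a minimal sketch that turns a fragment of HTML into a nested tree using Python's standard html.parser module. It is a toy stand-in for a browser DOM, not how Knowledge Vault itself works; the TreeBuilder class, the show helper, and the sample table row are all assumptions for illustration.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Build a simple nested-dict tree from HTML (a toy stand-in for a DOM)."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "document", "text": "", "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "text": "", "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the parent; assumes properly nested markup
        # with no void elements such as <br>.
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1]["text"] += data.strip()


def show(node, depth=0):
    """Print the tree with indentation that mirrors the nesting."""
    label = node["tag"]
    if node["text"]:
        label += " " + repr(node["text"])
    print("  " * depth + label)
    for child in node["children"]:
        show(child, depth + 1)


# Hypothetical page fragment: a table row holding one attribute of an entity.
html = "<table><tr><th>Capital</th><td>Albany</td></tr></table>"
builder = TreeBuilder()
builder.feed(html)
show(builder.root)
```

Once the page is a tree, an extractor can look for regular shapes such as a th cell followed by a td cell and read them as an attribute and its value.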
More Semantic Processing…
Term Frequency Data
Frequency, proximity, order
Aids in discovery across subject areas, filetypes and entire domains
Pattern Matching Algorithms
Focuses on recognition of patterns and regularities in text, data and images
Structured Data
Structured Web tables and data sets
(.xls, .kml, .sdf)
Human created tags – Schema.org
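The three term signals named on this slide (frequency, proximity, order) can be sketched in a few lines. This is a simplified illustration on a single made-up sentence, not a description of how any search engine actually scores pages; the function names and the sample text are assumptions.

```python
import re
from collections import Counter
from typing import List, Optional


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequency(tokens: List[str]) -> Counter:
    """Frequency: how often each term occurs."""
    return Counter(tokens)


def min_distance(tokens: List[str], a: str, b: str) -> Optional[int]:
    """Proximity: smallest gap, in tokens, between any occurrence of a and of b."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def appears_in_order(tokens: List[str], a: str, b: str) -> bool:
    """Order: True if some occurrence of a comes before some occurrence of b."""
    if a not in tokens or b not in tokens:
        return False
    first_a = tokens.index(a)
    last_b = len(tokens) - 1 - tokens[::-1].index(b)
    return first_a < last_b


# Hypothetical document text, written for this example.
doc = "Library search tools help library staff search the deep web"
tokens = tokenize(doc)
print(term_frequency(tokens).most_common(2))
print(min_distance(tokens, "library", "search"))
print(appears_in_order(tokens, "deep", "web"))
```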
Schema.org
Organization backed by Google, Bing, Yahoo and other engines to standardize metadata for use by crawler-based services.
Helps create "real" answers and "rich snippets"
Schema example: Restaurant with a menu
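A restaurant-with-a-menu description might look roughly like the following when written as JSON-LD, one of the formats Schema.org markup can take. The sketch builds it as a Python dictionary and serializes it with the standard json module. The restaurant, its dishes, and its prices are invented, and the property names reflect the Schema.org vocabulary as I understand it, so check schema.org for the current terms before relying on them.

```python
import json

# Hypothetical restaurant; every name and price here is made up for illustration.
restaurant = {
    "@context": "https://schema.org",
    "@type": "Restaurant",
    "name": "Example Bistro",
    "servesCuisine": "Italian",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Geneva",
        "addressRegion": "NY",
    },
    "hasMenu": {
        "@type": "Menu",
        "hasMenuSection": {
            "@type": "MenuSection",
            "name": "Dinner",
            "hasMenuItem": [
                {
                    "@type": "MenuItem",
                    "name": "Margherita pizza",
                    "offers": {"@type": "Offer", "price": "12.00", "priceCurrency": "USD"},
                },
                {
                    "@type": "MenuItem",
                    "name": "Minestrone",
                    "offers": {"@type": "Offer", "price": "6.50", "priceCurrency": "USD"},
                },
            ],
        },
    },
}

# Serialize to the text a web page would carry.
json_ld = json.dumps(restaurant, indent=2)
print(json_ld)
```

On a real page this text is usually placed inside a script element of type application/ld+json, where crawlers can read it without it being shown to visitors.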
Predictive Operations: Inferring the user’s intent
“The Holy Grail of Search”
Location-based results – IP and GPS
Weather, entertainment, restaurants…..
Anonymous past searches and user behavior
Personal data volunteered by user
Time of day
Device used
Semantic Predictive Processing Operations
Correctly interpret the query, or a portion of the query
Give a “best guess” answer based on highly trusted sources (knowledgebase) and similar searches
Aggregate and grow the knowledgebase through iterative, real-time web crawls
Discovery Apps: Personalized Search on Steroids
Combines your
Personal preferences
Location
Demographic characteristics
Social network data
People, Preferences, Interests, Events
Suggests entertainment, restaurants and more
Chat with your social network friends
“Current events you may like within X miles”
Gravy: free on iTunes
Personal Assistant Apps
Connects to your
E-mail
Calendar
Facebook events
Prompts for transportation times, quickest routes
Includes some discovery and chat features
Relies heavily on user-supplied personal data
Sunrise, Tempo, et al.
Apps and the Deep Web
Currently, crawler-based search engines cannot access content in apps
Posts
Links
Personal data
Education apps continue to grow in content, quality and use
Google is working on indexing them…
New Services
Izik.com
Search app by the Blekko search engine
Launched as a tablet app in 2013
Now accessible via desktop/laptop/smartphone
Searches Web, Twitter, News sources
Dynamically clustered results
Focuses on popular culture, shopping, news
Individual results can be shared via social networks
Qwant: A fresh approach to search
Aims to offer a European-based service that respects users’ privacy
Launched in France in 2013
Search verticals offered:
Web
Media (News)
People (Social Networks)
Boards (Online Forums, mostly European)
Results clusters offer “refine search”
Web, News, Social, Shopping
Qnowledge Graph (from Wikipedia and other general sources)
16 interface languages, which influence search results
Binpad: Hierarchical results clustering
Search options: Web, Wiki, PubMed
Results clusters include other closely related items, often in hierarchical order
All results include images
Verticals include News, Edu, TV, Editor
Editor still in development
Project of Xdroid, Inc., an enterprise search software developer in Hungary
CC Search search.creativecommons.org/
Searches media in the public domain
Flickr, YouTube, Jamendo, Wikimedia Commons, SoundCloud and others…..
Some sponsored results appear that are not in the public domain
Verify use conditions for each result
Search and the Dark Web
Dark Web: networks with server addresses intentionally obscured
Often house online criminal activities
Includes Tor network Hidden Services
These have the .onion TLD
Only accessible via the Tor Browser
Content is not password-protected, but is not accessible to crawler-based services because nothing links to it
Memex: DOD’s Dark Web Search Engine
Software to visualize and organize big data
Searches text, handwritten text, images, geographic data embedded in photos….
Identifies hidden relationships among websites, deep web sites and forums
Can provide the very latest top news, tips and cutting-edge research on a topic or interest
Slowly gaining popularity; require set-up time and maintenance
Locating lists using hashtags
topic or associated element
#tax #IRS
person, place, event associated with topic
#olympians #worldcup
“101 best twitter lists to follow” http://www.postplanner.com/101-best-twitter-lists-to-follow/
music.twitter.com: Twitter’s own music location service
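A small sketch of the hashtag idea above: pulling hashtags out of post text and tallying them, so the most-used tags for a topic surface first. It works on plain strings only and does not call Twitter; the sample posts and the hashtags function are assumptions made for the example.

```python
import re
from collections import Counter
from typing import List

# A hashtag here is '#' followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")


def hashtags(post: str) -> List[str]:
    """Return the hashtags in a post, lowercased and without the leading '#'."""
    return [tag.lower() for tag in HASHTAG.findall(post)]


# Hypothetical posts, written for this example.
posts = [
    "Filing deadline reminder #tax #IRS",
    "New forms are out #IRS",
    "Opening match tonight #worldcup",
]

# Tally every hashtag across all posts.
counts = Counter(tag for post in posts for tag in hashtags(post))
print(counts.most_common())
```

Lowercasing the tags treats #IRS and #irs as the same topic, which matches how hashtag searches are normally used.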
Education and the social searchscape
Offers first-hand accounts of events and conditions
Informative of current world cultures and trends on a wide range of subjects
Gateway to blogs and other online communication that can enhance scholarship
Channel for updates to educational programs
Embedded links and other information often highly relevant and recent
Requires careful evaluation of information found there
Data Visualization
Enables patterns to emerge in big data
More accessible to visual learners
Facilitates sharing across languages
Can be made compatible with a wide range of data formats
Responsive to real-time changes
Showcase of 2014 projects:
http://flowingdata.com/2014/12/19
Bing, Yahoo and DuckDuckGo
Looking for a niche
Bing and Yahoo represent 29% of all US searches (http://comscore.com, 12/1/14)
Yahoo
Focus is on local and personalized search results
Now partnered with Yelp, local business search engine
Bing
Focus is on lifestyle, travel, images, maps
Social search results (Facebook, Twitter) in a sidebar
Bing Image Search
High quality images
Related search offered, based on descriptive text associated with the image
Clustering by topic
Filters
Size, Color, Type, Layout
People, Date, License, SafeSearch
Image Match with a URL or image you upload
Entity Comparisons
Google Bing
Bing for Schools http://www.bing.com/classroom
Safe search filters and ad-free environment
Requires registration by a school
Not possible to access it for home use
Daily lesson plan available based on the image used each day on the Bing homepage
Excludes Bing apps
DuckDuckGo http://ddg.gg
Offers anonymous search functionality
Popularity spiked after the NSA PRISM surveillance revelations
Does not save search history of any type
Google does, using it "to increase relevancy"
Included as a search option in Apple's latest version of Safari
Has been blocked in China!
Google
Knowledge Vault: Beyond the Graph…
Knowledge Graph seeded from Freebase entities and human additions
Automated generation of entities increases number and discovers hidden relationships among entities and their attributes
Entities now appear at top of results page with related topics or other relevant information
Type of additional information varies depending on entity
Graph database stores data in nodes and relationships.
http://www.oaddo.org/home
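The "nodes and relationships" model can be sketched without any database at all. Below, entities are nodes, each relationship is a labeled edge, and a breadth-first search uncovers an indirect connection between two entities. The sample triples, the relationship names, and the find_path function are assumptions for illustration; a real graph database would use its own storage and query language.

```python
from collections import defaultdict, deque

# Each triple is (subject node, relationship, object node). Hypothetical sample data.
triples = [
    ("Ada", "works_at", "Example College"),
    ("Example College", "located_in", "Geneva"),
    ("Geneva", "located_in", "New York"),
]

# Adjacency list: node -> list of (relationship, neighboring node).
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[subject].append((relation, obj))


def find_path(start, goal):
    """Breadth-first search; returns the chain of relationships from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None  # no connection found in this direction


# An indirect relationship: Ada is connected to New York through two other nodes.
for step in find_path("Ada", "New York"):
    print(step)
```

Edges here are one-directional, so searching from "New York" back to "Ada" finds nothing; storing each relationship in both directions would be one way to change that.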
Right to be Forgotten ruling EU's European Court of Justice, May 2014
Google and other search engines must remove results deemed to be "inadequate, irrelevant or no longer relevant, or excessive in relation to the purposes for which they were processed and in the light of the time that has elapsed."
"How to write the In Depth Articles that Google Loves" copyblogger.com
Content farm orientation?
Requires careful evaluation of each item; unvetted websites in particular
Google's tech projects
Google for Kids - under 13; more parental controls
Project Loon - Provide Web access via high-altitude balloons
Self-driving cars
Google Glass 2
Smart contact lenses
Continuous health monitoring via disease-detecting nanoparticles
Liftware - stabilized spoon for tremor sufferers
"Google Tracker 2015" http://arstechnica.com
A bit of historical perspective: Top 5 http://www.washingtonpost.com/news/the-intersect/wp/2014/12/15/from-lycos-to-ask-jeeves-to-facebook-tracking-the-20-most-popular-web-sites-every-year-since-1996/
Search in the Future
Will continue to be more specialized
Shopping - Amazon
Travel - Kayak
Movies - IMDb
Real-time news - Twitter
Discovery software will integrate more diverse types of data, from crowdsourced to expert
Data overload will continue
Social web will increase as a tool for social change
Search engines will be challenged by governments worldwide in the areas of commercial monopoly and individual privacy