FAST Enterprise Search Digital Marketing Platform Briefing and Overview
Dec 18, 2014
FAST Capabilities and Architecture Overview
• The platform
• Language support
• Linguistic capabilities:
  • Type ahead
  • Spell check, “did you mean”
  • Find similar, synonyms
  • Boosting and blocking, visual best bets, ranking
• Faceted search
• Architecture
• Toshiba prototype presentation
Secure, scalable and unified access to information
Single cost-effective platform
• Access content securely with a full-featured content crawler and both index-time and query-time trimming of results
• Federate queries using OpenSearch to quickly access existing search indexes and online information services
• Extend your reach using Business Connectivity Services to access your content sources and business applications
[Diagram: content access overview. Labels: OpenSearch Federation; Indexing Connectors; Content; Business Applications; Web; User Experience; Search Index]
Content sources shown:
• Websites and intranet
• File shares
• IBM Lotus Notes
• EMC Documentum
• SharePoint Server
• Exchange Public Folders
• AD and LDAP profiles
• Etc.
Can search in any language
• 84 languages detected to allow language-specific handling
• Tokenization -> noun and phrase detection
• Stemming (“run” -> “running”)
• Lemmatization (“good”, “better”)
• Customizable synonyms
• Phrase search leverages stop words (“a room with a view”)
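A minimal sketch of how query-time expansion along these lines might work, assuming small hand-written dictionaries. `LEMMAS`, `SYNONYMS`, `expand_term`, and `tokenize` are hypothetical names for illustration and are not part of the FAST platform; the real product uses its own linguistic resources.

```python
import re

# Hypothetical dictionaries, for illustration only.
LEMMAS = {
    "run": {"run", "runs", "running", "ran"},
    "good": {"good", "better", "best"},
}
SYNONYMS = {"camera": {"camcorder"}}


def expand_term(term):
    """Return the term plus its inflected forms and any configured synonyms."""
    term = term.lower()
    variants = {term}
    for forms in LEMMAS.values():
        if term in forms:
            variants |= forms
    variants |= SYNONYMS.get(term, set())
    return variants


def tokenize(query):
    """Split a query into quoted phrases and single terms.

    Quoted phrases are kept whole, stop words included, so that
    "a room with a view" can be matched as an exact phrase.
    """
    return [m.group(1) or m.group(2)
            for m in re.finditer(r'"([^"]+)"|(\S+)', query)]
```

For example, `expand_term("better")` yields the whole “good” family, while `tokenize('"a room with a view" running')` keeps the quoted phrase intact and leaves `running` as a separate term to expand.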
Better Language Support
Afrikaans          Hausa              Pashto, Pushto
Albanian           Hebrew             Persian
Arabic             Hindi              Polish
Armenian           Hungarian          Portuguese
Azerbaijani        Icelandic          Punjabi
Basque             Indonesian         Rhaeto-Romance
Bengali, Bangla    Irish              Romanian
Bosnian            Italian            Russian
Breton             Japanese           Sami (Northern)
Bulgarian          Kannada            Serbian
Catalan            Kazakh             Slovak
Chinese-S          Kirghiz            Slovenian
Chinese-T          Korean             Sorbian
Croatian           Kurdish            Spanish
Czech              Latin              Swahili
Danish             Latvian, Lettish   Swedish
Dutch              Letzeburgesch      Tagalog
English            Lithuanian         Tamil
Estonian           Macedonian         Telugu
Faroese            Malay              Thai
Finnish            Malayalam          Turkish
French             Maltese            Ukrainian
Galician           Maori              Urdu
Georgian           Marathi            Uzbek
German             Mongolian          Vietnamese
Greek              Norwegian          Welsh
Greenlandic        Norwegian-B        Yiddish
Gujarati           Norwegian-N        Zulu
Advanced Linguistics
• Type ahead
• Spell check and “did you mean”
• Find similar documents and federated search (screenshot callouts: find similar documents; federated results)
• Search analytics (synonyms, boost and block, best bets)
• Synonyms
Control Search Results
Best bets
• Highlight specific content

Control Search Results
Best bets + user context
• Audience-based search results
• Didier belongs to the “camera novices” audience; Cari belongs to “camera enthusiasts”
• Banner for camera novices: consumer-oriented cameras
• Banner for camera enthusiasts: advanced cameras
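The audience-based best bets above could be modelled roughly as follows. This is a sketch under stated assumptions: `BestBet`, `USER_CONTEXT`, and `best_bets_for` are hypothetical names, and pairing the consumer-oriented banner with novices and the advanced banner with enthusiasts is inferred from the slide, not taken from a FAST API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BestBet:
    keyword: str                     # query term that triggers the banner
    banner: str                      # content to highlight
    audience: Optional[str] = None   # None means "show to everyone"


# User-to-audience assignments from the slide example.
USER_CONTEXT = {"Didier": "camera novices", "Cari": "camera enthusiasts"}

# Assumed pairing of banners and audiences.
BEST_BETS = [
    BestBet("camera", "Consumer-oriented cameras", "camera novices"),
    BestBet("camera", "Advanced cameras", "camera enthusiasts"),
]


def best_bets_for(user, query):
    """Return banners whose keyword is in the query and whose audience fits the user."""
    audience = USER_CONTEXT.get(user)
    terms = query.lower().split()
    return [bet.banner for bet in BEST_BETS
            if bet.keyword in terms and bet.audience in (None, audience)]
```

With this data, the same query “digital camera” returns a different banner for Didier than for Cari, and no audience-specific banner for a user without a known context.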
Boost and Block
Control ranking of any search results
• Boost or block any search result…
• Control the ranking based on users
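As an illustration of the idea only, boosting and blocking can be thought of as a post-processing step over scored results. The function name, the additive boost, and the data shapes below are assumptions for this sketch; the slide does not describe how the platform applies boosts internally.

```python
def apply_boost_and_block(results, boosts, blocked):
    """Drop blocked results, add boosts to the rest, and re-sort by score.

    results: list of (url, score) pairs
    boosts:  dict mapping url -> extra score to add (assumed additive)
    blocked: set of urls that must never be shown
    """
    kept = [(url, score + boosts.get(url, 0.0))
            for url, score in results
            if url not in blocked]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Given results `a` (3.0), `b` (2.0), and `c` (1.0), blocking `a` and boosting `c` by 5.0 puts `c` first and removes `a` entirely.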
Rank Profiles
Tune relevancy without impacting the default algorithm
• Quality: Also known as static rank. Consists of multiple managed properties, including site, URL depth (preference for shorter URLs), and the relative importance of links to this document.
• Authority: Applies when the query word falls in the link or anchor text.
• Query Authority: Reflects the popularity of a document, measured by the click-through rate when documents are clicked as a result of a query.
• Freshness: Increases the relevancy if a document was recently created or modified, based on the last modified property.
• Proximity: Considers where query terms fall and how close they are to each other within a document.
• Context: Increases the rank of a document if the query term is a managed property associated with that document.
• Managed Property: Affects relevancy when a managed property contains a specific value, such as Woodgrove Bank or Financial Services.
Out-of-the-box relevancy
Tuned for a great general productivity experience; relevancy improves with click-throughs and link text analysis.
Extend the default algorithms
Create new default relevancy models. Blend static and dynamic ranking parameters to instantly improve search results.
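One way to picture “extending the default algorithm” is a set of per-component weights that a new profile overrides without touching the default. The sketch below assumes a simple weighted sum; that formula, the weights, and the names `DEFAULT_PROFILE`, `extend_profile`, and `rank_score` are illustrative assumptions, not the product's actual ranking model.

```python
# Component names follow the rank profile slide; all numbers are made up.
DEFAULT_PROFILE = {
    "quality": 1.0,
    "authority": 1.0,
    "query_authority": 1.0,
    "freshness": 1.0,
    "proximity": 1.0,
    "context": 1.0,
    "managed_property": 1.0,
}


def extend_profile(base, **overrides):
    """Return a new profile derived from base, leaving base unchanged."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise ValueError(f"unknown rank components: {sorted(unknown)}")
    return {**base, **overrides}


def rank_score(components, profile):
    """Weighted sum of per-document component scores (missing ones count as 0)."""
    return sum(weight * components.get(name, 0.0)
               for name, weight in profile.items())


# A profile that favours recently modified documents.
freshness_profile = extend_profile(DEFAULT_PROFILE, freshness=3.0)
old_doc = {"quality": 0.9, "freshness": 0.1}
new_doc = {"quality": 0.5, "freshness": 0.4}
```

Under `DEFAULT_PROFILE` the older, higher-quality document scores higher; under `freshness_profile` the newer one does, and the default weights are untouched.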
How to create a Rank Profile
IT Pros are empowered to create new profiles quickly
• Rank profiles are created in PowerShell by extending the default relevancy algorithm…
• … and are exposed in the user interface by modifying the sorting web part.
Faceted Navigation
Enables search-driven navigation that is relevant to your business
• Add needed metadata with pre-built extractors that automatically tag people, locations, and company names
• Extend easily to recognize business-specific terms and concepts, tailoring search for your information
• Surface recognized properties in navigation, making search results more relevant and discoverable
Powerful Entity Extraction
Enables search-driven navigation that is relevant to your business
• Entity types shown: CONCEPT, PRODUCT, COMPANY

Faceted Navigation
• Crawled properties: Doctitle, ProductName, ManufacturerName, …
• Managed properties: Title, Manufacturer, …
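The relationship between crawled and managed properties can be sketched as a mapping followed by a facet count. The specific mapping below (Doctitle and ProductName feeding Title, ManufacturerName feeding Manufacturer) is an assumption made for illustration, as are the names `PROPERTY_MAP`, `to_managed`, and `facet_counts`.

```python
from collections import Counter

# Managed property -> crawled properties that may feed it, in priority order.
# This mapping is assumed for the example, not read from a real schema.
PROPERTY_MAP = {
    "Title": ["Doctitle", "ProductName"],
    "Manufacturer": ["ManufacturerName"],
}


def to_managed(crawled):
    """Map one document's crawled properties onto managed properties."""
    managed = {}
    for name, sources in PROPERTY_MAP.items():
        for source in sources:
            if crawled.get(source):
                managed[name] = crawled[source]
                break  # first non-empty source wins
    return managed


def facet_counts(documents, managed_property):
    """Count values of a managed property across documents, as a refiner would show."""
    return Counter(doc[managed_property]
                   for doc in documents
                   if managed_property in doc)
```

Running `to_managed` over each crawled item and then `facet_counts(docs, "Manufacturer")` gives the value counts that a manufacturer refiner would display next to the results.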
SharePoint and FAST Search
Summary of architectural elements
[Architecture diagram. Labels recovered from the diagram:]
• Front ends: Custom Front-End; SharePoint Front-End; People Search; OpenSearch or Other Sources
• Query side: Query Object Model; Federation Object Model; Query Web Service; Query and Result Processing; Security Access Module
• Core: Search Core; Indexing; Web Link Analysis
• Content side: Content; Content Processing
• Connectors (first group): Web Crawler, JDBC
• Connectors (second group): SharePoint, File Traverser, Web, BDC, Exchange, Notes, Documentum
• Monitoring: Microsoft System Center Operations Manager; Monitoring Services
• Administration and Schema Object Model
  • Site Collection Level Admin UI: Keyword Management, User Context Management, Site Promotion/Demotion
  • PowerShell: Schema configuration, Admin configuration, Deployment configuration
  • Central Administration UI: Property mapping, Property extraction, Spell-checking
• Servers: FAST Server(s); SharePoint Server(s); Other Server(s)
Extensible Content Processing
Enables search that has a deep understanding of your information
• Transform content using a processing pipeline that normalizes and cleanses all of your information
• Use globally with linguistics processing for 45 languages and recognition of content in 80+ languages
• Add or extend stages that apply sentiment analysis, translation, or other business-specific processing you need
Pipeline stages shown include: Format Converter, Language Detector, Lemmatizer, Word Breaker, Entity Extractor, Date/Time Normalizer, Vectorizer, Web Analyzer, Properties Mapper
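Conceptually, a content processing pipeline is a sequence of stages that each take a document and return an enriched copy, which is what makes adding a custom stage straightforward. The sketch below uses toy stand-ins for three stages; the stage logic, the word lists, and the name `run_pipeline` are assumptions for illustration and do not reflect how the FAST stages are implemented.

```python
import re


def language_detector(doc):
    """Toy heuristic: call the text English if it contains common English stop words."""
    words = set(doc["body"].lower().split())
    language = "en" if words & {"the", "and", "of"} else "unknown"
    return {**doc, "language": language}


def word_breaker(doc):
    """Split the body into lowercase word tokens."""
    return {**doc, "tokens": re.findall(r"\w+", doc["body"].lower())}


def sentiment_tagger(doc):
    """Example of a custom stage: positive-word count minus negative-word count."""
    positive = {"great", "excellent"}
    negative = {"poor", "broken"}
    tokens = doc["tokens"]
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return {**doc, "sentiment": score}


def run_pipeline(doc, stages):
    """Pass the document through each stage in order."""
    for stage in stages:
        doc = stage(doc)
    return doc


STAGES = [language_detector, word_breaker, sentiment_tagger]
```

Calling `run_pipeline({"body": "..."}, STAGES)` returns the document with `language`, `tokens`, and `sentiment` fields added; extending the pipeline amounts to appending another function to `STAGES`.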
FAST Search for SharePoint Scale-out
Back end with extreme and flexible scale-out options
• Scale out along multiple “dimensions”: content volume, query volume, indexing freshness
• Redundancy options: search, indexing
• Performance targets*: 15M docs per column, 30 QPS per row
• No theoretical upper bounds!
* Dependent on document and hardware characteristics
[Diagram: node grid with axes Content Volume and Query Volume; tiers labeled Crawling and Content Processing, Search and Indexing, Query and Result Processing]
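The performance targets above lend themselves to a quick back-of-the-envelope sizing: columns grow with content volume and rows grow with query volume. The sketch below assumes the targets apply linearly and that a redundant setup needs at least two rows; it counts search nodes only and ignores admin, crawling, and content processing servers. `size_search_grid` is a hypothetical helper, not a FAST tool.

```python
import math

DOCS_PER_COLUMN = 15_000_000   # target from the slide
QPS_PER_ROW = 30               # target from the slide


def size_search_grid(total_docs, peak_qps, redundant=False):
    """Estimate (columns, rows, search nodes) for a given content and query volume."""
    columns = max(1, math.ceil(total_docs / DOCS_PER_COLUMN))
    rows = max(1, math.ceil(peak_qps / QPS_PER_ROW))
    if redundant:
        rows = max(rows, 2)  # assumption: a second row provides failover
    return columns, rows, columns * rows
```

For 40 million documents at a peak of 50 queries per second this gives 3 columns by 2 rows, or 6 search nodes. As the slide notes, the real figures depend on document and hardware characteristics.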
FAST: Small deployment
Low volume with HA
[Topology diagram. Labels recovered from the diagram:]
• FRONT-END WEB TIER
• SERVICE APPLICATION TIER
• DATABASE TIER: Database, Crawler db
• FAST Admin, Web analyzer, crawl
• Shown twice: Content distributor, Item processing, Indexing dispatcher, Query processing, Search / Index
• Shown twice: Web Services, Query SSA, Content SSA
Customizable Search Experience
Enables building applications that increase user efficiency
• Add visual navigation that makes it easier for people to understand and gain insight from search results
• Use familiar tools like SharePoint Designer and Visual Studio that integrate with SharePoint and FAST Search
• Build new kinds of apps that allow people to work with disparate information utilizing common APIs for search
Query Language Expressiveness
FAST Query Language (FQL) example: documents matching “performance” that also mention Diane Tibbott, Mary Baker, or both get a rank boost of 5000:
xrank(string("performance"), or(person:string("diane tibbott"), person:string("mary baker")), boost=5000)

Operator Type    Keywords
Boolean          AND, ANDNOT, OR, ANY, NOT, COUNT
Proximity        NEAR, ONEAR
Numeric          FLOAT, INT, DATETIME
String           WEIGHT, WILDCARD, MODE
Boundary Match   STARTS-WITH, ENDS-WITH, EQUALS
Ranking          RANK, XRANK
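Because FQL expressions nest, front-end code often assembles them from small helpers rather than by hand. The sketch below rebuilds the xrank example from the slide as a string; the helper names (`fql_string`, `fql_or`, `fql_xrank`) are hypothetical and only produce text, they do not call any FAST API. Escaping of embedded quotes is not covered by the source, so the helper rejects them instead of guessing.

```python
def fql_string(text, field=None):
    """Build string("...") and optionally scope it to a managed property."""
    if '"' in text:
        # Escaping rules are not covered here, so refuse rather than guess.
        raise ValueError("double quotes in terms are not supported by this sketch")
    expr = f'string("{text}")'
    return f"{field}:{expr}" if field else expr


def fql_or(*operands):
    """Combine operands with the OR operator."""
    return "or(" + ", ".join(operands) + ")"


def fql_xrank(match, rank, boost):
    """Match on `match`; add `boost` to the rank of results that also satisfy `rank`."""
    return f"xrank({match}, {rank}, boost={boost})"


query = fql_xrank(
    fql_string("performance"),
    fql_or(fql_string("diane tibbott", field="person"),
           fql_string("mary baker", field="person")),
    boost=5000,
)
```

Building the expression this way keeps the parentheses balanced by construction, which is the most common mistake when writing nested FQL by hand.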
Evolution of Search
[Diagram: an evolution of search. Labels: Keywords; Navigation; Featured Content; Recommendations (item recs); Insight; Refinement; Segment]