Multimedia Search Bradley Horowitz Head of Technology Development Yahoo! Search and Marketplace November, 2005

Dec 22, 2015


Multimedia Search

Bradley Horowitz
Head of Technology Development
Yahoo! Search and Marketplace

November, 2005


Yahoo! Search Vision

Enable people to find, use, share and expand all human knowledge

• Find: Enable people to find what they are looking for

• Use: Search not for the sake of searching, but to achieve a purpose

• Share: Sharing knowledge with the people you connect with, and connecting to the people you share knowledge with

Expand


FUSE

fuse (fyooz) verb: fused, fus•ing, fus•es

To become mixed or united by melting together

fusion (fyoo zhən) noun

A reaction in which nuclei combine to form more massive nuclei with the simultaneous release of energy

Knowledge Fusion: Enable people to find, use, share and expand all human knowledge



How it works…

Standard Crawl / Index / Serve paradigm

• Automated crawlers spider the web and encounter multimedia content. (Feeds augment the crawl…)

• Metadata is generated and indexed; content is fetched, and a “summary” (thumbnail) is extracted and stored on Yahoo servers

• Queries are processed and results are served
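
The crawl / index / serve steps above can be sketched in a few lines. This is a toy illustration only, not Yahoo's implementation: the class and method names (`ToyMediaIndex`, `ingest`, `serve`) are hypothetical, the "crawl" is simulated by handing in URLs with already-derived metadata text, and the thumbnail is a placeholder string rather than real image data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MediaRecord:
    url: str
    metadata: str   # text derived for the object (file name, ALT text, ...)
    thumbnail: str  # placeholder for the stored "summary"


class ToyMediaIndex:
    """Hypothetical in-memory stand-in for the index and thumbnail store."""

    def __init__(self):
        self.records = {}
        self.inverted = defaultdict(set)  # term -> set of URLs

    def ingest(self, url, metadata):
        """Crawl/index step: store a record and index its metadata terms."""
        self.records[url] = MediaRecord(url, metadata, thumbnail=f"thumb:{url}")
        for term in metadata.lower().split():
            self.inverted[term].add(url)

    def serve(self, query):
        """Serve step: return the records that match every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.inverted.get(t, set()) for t in terms))
        return [self.records[u] for u in sorted(hits)]
```

Results come back in URL order here; ranking them is a separate problem.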


Challenges: relevance & ranking

• What is the “best” match, i.e. the best image for a given query?

• Techniques like link-flux or PageRank are not as meaningful in a multimedia context

• Clickstream analysis is a useful tool
– Unlike web results, users can consume content directly on the SRP…
– Therefore clicks are meaningful “votes”
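
One way to treat clicks as “votes” is to rank results by a smoothed click-through rate. The sketch below is an assumption about how this might look, not the method used in Yahoo's ranking: the function name and the prior values are made up for illustration. The prior keeps a result with one click on one impression from outranking a result with hundreds of clicks.

```python
def rank_by_clicks(stats, prior_clicks=1.0, prior_views=20.0):
    """Order result ids by smoothed click-through rate, best first.

    stats maps a result id to a (clicks, impressions) pair.
    prior_clicks and prior_views are illustrative smoothing constants.
    """
    def score(item):
        clicks, impressions = item[1]
        return (clicks + prior_clicks) / (impressions + prior_views)

    return [rid for rid, _ in sorted(stats.items(), key=score, reverse=True)]
```

Usage: `rank_by_clicks({"a": (50, 1000), "b": (300, 1000), "c": (1, 1)})` puts `"b"` first; without smoothing, `"c"` would win on a single click.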


Challenges (v. Web Search)

• Crawling multimedia data can be difficult

• URLs are often dynamically generated, and often intentionally obfuscated

• Huge storage implications (thumbnail cache)

• Multimedia objects are binary, and hence “opaque” v. self-describing web pages, thus the dependence on heuristic means of deriving metadata

• Relevance is difficult to determine


Metadata Extraction

Any and all means….

• Syntactic stuff: Size, dimensions, format, duration, frames per second, etc.
• Object name: “dog.gif”
• ALT tags: <… ALT=“Picture of a Dog”>
• Embedded metadata (headers): ID3 tags, EXIF information, etc.
• Analysis:
– Color / black & white
– Acoustic Fingerprinting
– Speech Recognition
– Speaker Identification
– OCR
– Face Recognition
– Object Recognition
– Indoor / outdoor
– Music / Speech
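
The cheapest of these signals (object name, format, ALT text) can be read straight from the page that embeds the image. A minimal sketch using only the Python standard library follows; the class and function names are hypothetical, and it covers only `<img>` tags, not embedded headers or any of the analysis techniques.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urlparse


class ImageMetadataParser(HTMLParser):
    """Collects object name, format and ALT text for each <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if not src:
            return
        path = PurePosixPath(urlparse(src).path)
        self.images.append({
            "src": src,
            "object_name": path.stem,                      # "dog" from dog.gif
            "format": path.suffix.lstrip(".").lower(),     # "gif"
            "alt": attributes.get("alt") or "",
        })


def extract_image_metadata(html):
    parser = ImageMetadataParser()
    parser.feed(html)
    parser.close()
    return parser.images
```

Usage: `extract_image_metadata('<img src="/pics/dog.gif" ALT="Picture of a Dog">')` yields one record with object name `dog`, format `gif` and the ALT text.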


Problems with automated Metadata Extraction

• Techniques are computationally expensive, very difficult to compute at web scale

• Results are noisy, prone to error…

• Some of these drawbacks can be mitigated by using context

• Biggest problem is that these techniques do not extract the most valuable level of information


A long digression on metadata


ESP Game


Great fun!


Over 10m photos tagged…


FUSE Case Study: Flickr


What’s important about Flickr?

“An online photo sharing community…”

• Filled with high-quality, timely, topical photographic content… (content entirely user-generated)

• Richly annotated and indexed, searchable, browsable and navigable… (metadata entirely user-generated)

• Tens of thousands of distribution partners… (each “deal” brokered by Flickr users)

• Hundreds of applications written against the Flickr platform… (by 1000s of Flickr developers)


Flickr


Photos from my Contacts


Photos from Nathan


Photos from Nathan via RSS



Tagging


Tagging / cats


Tagging / interestingness


Tagging / clusters


Flickr Flash Widget


Inline on a blog…


Flickr publishing


Flickr as a service…


Culture of Participation

1 creator

10 synthesizers

100 consumers


Mass Media

Micro Media

My Media


Mass Media as artifact…

• Negroponte: “Being Digital” - bits v. atoms

• Current system of media production & consumption is largely an artifact of “atom-based” distribution

• Media has been relatively difficult for individuals to create, and virtually impossible for individuals to distribute at scale

• Led to a high-stakes economic model

• Mass media has flourished, in part, because micro-media wasn’t viable…


The Tale of the Head & the Tail

[Figure: head and tail curve, with the head labeled “Popular content” and the tail labeled “…everything else”]

Postulates

•Area under the curve of “long tail” is significant…

•Economics around this “tail” content often invert
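
A quick worked example shows why the area under the tail can be significant. The sketch below assumes popularity follows a Zipf-like curve (the item at rank r gets weight 1/r^s); that distribution, the exponent and the catalog sizes are illustrative assumptions, not figures from the talk.

```python
def tail_share(n_items, head_size, exponent=1.0):
    """Fraction of total demand that falls outside the top `head_size` items.

    Assumes a Zipf-like popularity curve: weight(rank) = 1 / rank**exponent.
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_items + 1)]
    total = sum(weights)
    return sum(weights[head_size:]) / total


if __name__ == "__main__":
    # Hypothetical catalog: one million items, with the top 1,000 as the "head".
    print(f"tail share: {tail_share(1_000_000, 1_000):.2f}")
```

Under these assumed parameters the tail accounts for a little under half of total demand, even though each individual tail item is rarely requested. A steeper exponent shrinks the tail's share.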


Mass v. Micro Media

Mass Media

• Appeals to large audiences
• Controlled and manicured distribution
• Expensive to produce
• Studio model, high-stakes economics

Micro Media

• May have limited audience
• Uncontrolled, unmoderated distribution
• Cheap to produce
• Different, developing economic model

My Media

Ability to consume from both head and tail

Ability to create and share “my” content


Content Acquisition Strategy

Explicit Feed Relationships

Comprehensive Media Crawl

Media RSS


Media RSS 1.0

• Simple extension to wildly popular RSS format

• Builds on “podcasting” movement, and extends to other media data types

• Designed for “grass-roots” publishing, enabling audiences for content previously unavailable through traditional channels

• Working with community “to get something done,” in partnership and collaboration
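
To make the “simple extension” concrete, here is a sketch of reading media entries out of an RSS 2.0 feed that carries Media RSS elements. The feed content, URLs and the `media_items` helper are invented for illustration. The `media:content` and `media:thumbnail` elements and the namespace URI follow the published Media RSS specification as best understood here; check the spec for the exact version you target, since the trailing slash in the namespace has varied.

```python
import xml.etree.ElementTree as ET

# Namespace as given in the Media RSS specification (assumed, with trailing slash).
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}

# Hypothetical feed: every title and URL below is a placeholder.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example video feed</title>
    <link>http://example.com/</link>
    <description>Hypothetical grass-roots publisher</description>
    <item>
      <title>First clip</title>
      <link>http://example.com/clips/1</link>
      <media:content url="http://example.com/clips/1.mov" type="video/quicktime"/>
      <media:thumbnail url="http://example.com/clips/1.jpg"/>
    </item>
  </channel>
</rss>"""


def media_items(feed_xml):
    """Return one dict per <item> that carries a media:content element."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        content = item.find("media:content", MEDIA_NS)
        if content is None:
            continue  # plain RSS item without media
        thumbnail = item.find("media:thumbnail", MEDIA_NS)
        items.append({
            "title": item.findtext("title", default=""),
            "url": content.get("url"),
            "type": content.get("type"),
            "thumbnail": thumbnail.get("url") if thumbnail is not None else None,
        })
    return items
```

Because the media elements live in their own namespace, an ordinary RSS reader can ignore them and still display the item title and link.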


Mass Media vs. Mass Media + Micro Media = My Media


Numa Numa


Yahoo Video Search

• Straightforward video search a la Image Search
• Yahoo is extremely sensitive to rights holders
– 24 hour take-down policy
– Working in partnership and collaboration with studios and networks to serve their needs
– Leveraging existing business models and objectives – we drive traffic to your property
– Great tool for monitoring and dealing with infringing content
• Extremely successful, more innovation in the works
• Video Search is not a technology problem per se, but a “business problem” Yahoo is well-poised to address.


Thanks!

[email protected]