2003.04.02 - SLIDE 1 IS246 - SPRING 2003
Lecture 18: Final Project Overview
IS246 Multimedia Information (FILM 240, Section 4)
Prof. Marc Davis
UC Berkeley SIMS
Monday and Wednesday 2:00 pm – 3:30 pm
Spring 2003
http://www.sims.berkeley.edu/academics/courses/is246/s03/
• Today people cannot easily find, edit, share, and reuse media
• Computers don’t understand media content
– Media is opaque and data rich
– We lack structured representations
• Without content representation (metadata), manipulating digital media will remain like word-processing with bitmaps
Desiderata for Media Metadata
• At least, Pat should be able to use Pat’s metadata
• Better, Chris should be able to use Pat’s metadata
• Even better, Chris’s computer should be able to use Pat’s metadata
• At best, Chris, Pat, and their computers should be able to use the metadata they all produce
Representing Video
• Streams vs. Clips
– Stream-based annotation makes annotation pay off
  • The richer the annotation, the more numerous the possible segmentations of the video stream
  • Clips change from being fixed segmentations of the video stream, to being the results of retrieval queries based on annotations of the video stream
  • Annotations create representations that make clips, not representations of clips
• Video syntax and semantics
– The Kuleshov Effect
– Video has a dual semantics
  • Sequence-independent invariant semantics of shots
  • Sequence-dependent variable semantics of shots
• Ontological issues in video representation
– Video plays with rules for identity and continuity
  • Space
  • Time
  • Person
  • Action
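
The idea that clips are the results of retrieval queries over stream annotations can be sketched in a few lines. This is only an illustrative sketch, not the Media Streams implementation: the `Annotation` class, the `find_clips` function, the descriptor names, and the times in seconds are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One descriptor attached to a time span of the stream (seconds)."""
    descriptor: str
    start: float
    end: float


def find_clips(annotations, wanted):
    """Return (start, end) spans where every wanted descriptor is active at once."""
    relevant = [a for a in annotations if a.descriptor in wanted]
    # Between two neighbouring boundaries the set of active descriptors is constant.
    bounds = sorted({t for a in relevant for t in (a.start, a.end)})
    clips = []
    for lo, hi in zip(bounds, bounds[1:]):
        active = {a.descriptor for a in relevant if a.start <= lo and a.end >= hi}
        if active >= set(wanted):
            if clips and clips[-1][1] == lo:
                # Extend the previous clip when the spans touch.
                clips[-1] = (clips[-1][0], hi)
            else:
                clips.append((lo, hi))
    return clips


# Hypothetical annotations on a single video stream.
stream = [
    Annotation("Jack", 0.0, 12.0),
    Annotation("walking", 2.0, 10.0),
    Annotation("waving", 4.0, 7.0),
    Annotation("street", 0.0, 20.0),
]

print(find_clips(stream, {"Jack", "waving"}))   # [(4.0, 7.0)]
print(find_clips(stream, {"Jack", "walking"}))  # [(2.0, 10.0)]
```

No clip boundaries are stored anywhere: the same stream yields different clips for different queries, and adding annotations adds possible segmentations.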
The Search for Solutions
• Current approaches to creating metadata don’t work
– Signal-based analysis
– Keywords
– Natural language
• Need standardized metadata framework
– Designed for video and rich media data
– Human and machine readable and writable
– Standardized and scaleable
– Integrated into media capture, archiving, editing, distribution, and reuse
Signal-Based Parsing
• Effective and useful automatic parsing
– Video
  • Scene break detection
  • Camera motion analysis
  • Low level visual similarity
  • Feature tracking
– Audio
  • Pause detection
  • Audio pattern matching
  • Simple speech recognition
• Approaches to automated parsing
– At the point of capture, integrate the recording device, the environment, and agents in the environment into an interactive system
– After capture, use “human-in-the-loop” algorithms to leverage human and machine intelligence
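
As a rough illustration of scene break detection, the sketch below compares gray-level histograms of consecutive frames and reports a break when they differ sharply. Histogram differencing is one common textbook approach and is not necessarily the method the lecture has in mind; the function names, the bin count, the threshold, and the tiny synthetic frames are assumptions chosen for the example.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized gray-level histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]


def scene_breaks(frames, threshold=0.5):
    """Indices i where frame i appears to start a new shot."""
    if not frames:
        return []
    breaks = []
    previous = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        # L1 distance between normalized histograms lies in [0, 2].
        distance = sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:
            breaks.append(i)
        previous = current
    return breaks


# Synthetic "video": three dark frames followed by three bright frames.
dark = [20, 25, 30, 22]
bright = [220, 230, 210, 225]
frames = [dark, dark, dark, bright, bright, bright]

print(scene_breaks(frames))  # [3]
```

A detector like this finds where the signal changes but says nothing about what the shots contain, which is why the slides pair automatic parsing with human annotation.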
Why Keywords Don’t Work
• Are not a semantic representation
• Do not describe relations between descriptors
• Do not describe temporal structure
• Do not converge
• Do not scale
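
The point that keywords do not describe relations between descriptors can be made concrete with a small sketch. The `Relation` tuple and the two example shots are hypothetical, invented only for illustration.

```python
from collections import namedtuple

# Hypothetical relational descriptor: who is doing what to whom.
Relation = namedtuple("Relation", ["agent", "action", "patient"])

shot_a = Relation(agent="dog", action="bites", patient="man")
shot_b = Relation(agent="man", action="bites", patient="dog")


def keywords(relation):
    """Flatten a relational description into an unordered keyword set."""
    return frozenset(relation)


print(keywords(shot_a) == keywords(shot_b))  # True: keywords cannot tell the shots apart
print(shot_a == shot_b)                      # False: the relational form can
```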
Natural Language vs. Visual Language
Jack, an adult male police officer, while walking to the left, starts waving with his left arm, and then has a puzzled look on his face as he turns his head to the right; he then drops his facial expression and stops turning his head, immediately looks up, and then stops looking up after he stops waving but before he stops walking.
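
The temporal structure buried in that sentence is easier to check once each action is written as a time span. The sketch below is one possible encoding, not the notation used in the lecture: the `Span` class and the times in seconds are assumptions, and only the ordering of the events comes from the sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    label: str
    start: float
    end: float


# Hypothetical times in seconds; only their ordering comes from the sentence.
timeline = [
    Span("walking left", 0, 10),
    Span("waving left arm", 1, 6),
    Span("puzzled look", 2, 4),
    Span("turning head right", 2, 4),
    Span("looking up", 4, 7),
]


def active_at(spans, t):
    """Labels of all spans that cover time t."""
    return [s.label for s in spans if s.start <= t < s.end]


by_label = {s.label: s for s in timeline}
looking = by_label["looking up"]
waving = by_label["waving left arm"]
walking = by_label["walking left"]

# "stops looking up after he stops waving but before he stops walking"
print(waving.end < looking.end < walking.end)  # True
print(active_at(timeline, 3))
# ['walking left', 'waving left arm', 'puzzled look', 'turning head right']
print(active_at(timeline, 5))
# ['walking left', 'waving left arm', 'looking up']
```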
Notation for Time-Based Media: Music
Visual Language Advantages
• A language designed as an accurate and readable representation of time-based media
– For video, especially important for actions, expressions, and spatial relations
• Enables Gestalt view and quick recognition of descriptors due to designed visual similarities
• Supports global use of annotations
After Capture: Media Streams
Media Streams Features
• Key features
– Stream-based representation (better segmentation)
– Semantic indexing (what things are similar to)
– Relational indexing (who is doing what to whom)
– Temporal indexing (when things happen)
– Iconic interface (designed visual language)
– Universal annotation (standardized markup schema)
• Key benefits
– More accurate annotation and retrieval
– Global usability and standardization
– Reuse of rich media according to content and structure
Today’s Agenda
• Review of Last Time
• Final Project
– Final Project Overview
– Final Project Ideation
– Final Project Team Building
• Action Items for Next Time
Final Project Overview
• Project goals
– Opportunity to integrate, apply, and demonstrate your understanding of the theories and lessons learned in the class sessions and previous assignments
– Make a useful contribution to some aspect of our work within and understanding of multimedia information systems
• Project design
– You will choose
  • Size and composition of your project group
  • Topic you investigate
  • Medium of its exploration and presentation
Final Project Overview
• Project size
– Paper: 1-2 people
– Interactive Low-Fi Prototype: 2-3 people
– Non-Interactive Video Low-Fi Prototype: 3-4 people
– Interactive Hi-Fi Prototype: 3-5 people
• Project medium
– Writing a detailed paper
– Designing a low-fi prototype
– Designing a hi-fi prototype of a system module
• Process
– Through an iterative process of ideation, proposal, specification, implementation, and presentation, you will get feedback on your final project at every stage of its development
• Questions
– What problem are we trying to solve?
– To whom does this solution matter? Why?
– What do we expect to learn from this project?
Final Project Schedule
• Week 11
– Wed 04/02/2003 Milestone 1 – Final Project Team / Idea Formation assigned
• Week 12
– Mon 04/07/2003 Milestone 1 due
– Mon 04/07/2003 Milestone 2 – Final Project Proposal assigned
• Production
– Continuity systems
– Directing
– Cinematography
– Production information tracking
• Postproduction
– Editing
– Special effects
– Sound design
• Distribution
– Customization/Personalization (based on location, person, platform, device, context)
Readings for Next Week
• Monday 04/07 “Media Asset Management and Reuse Process”
– M. Christel, S. Stevens, T. Kanade, M. Mauldin, R. Reddy, and H. Wactlar, "Techniques For The Creation And Exploration Of Digital Video Libraries," in Multimedia Tools and Applications, vol. 2, B. Furht, Ed. Boston: Kluwer Academic Publishers, 1996; pp. 1-33.
– N. Dimitrova, H.-J. Zhang, B. Shahraray, I. Sezan, T. Huang, and A. Zakhor, "Applications of Video Content Analysis and Retrieval," IEEE MultiMedia, vol. 9, 2002; pp. 42-55.
– Prelinger, R. ARCHIVAL SURVIVAL: The Fundamentals of Using Film Archives and Stock Footage Libraries. The Independent Film & Video Monthly (October); pp. 1-4.
– Jenkins, H. Textual Poachers: Television Fans & Participatory Culture. Routledge, New York, 1992; pp. 223-249.
• Wednesday 04/09 Guest Lecture: Paul Grabowicz on “Multimedia Industry Overview and Prospects”
– Rich Gordon, Associate Professor of New Media at Northwestern's Medill School of Journalism, The Meanings and Implications of Convergence; pp. 12-13. (http://www.medill.northwestern.edu/alumni/medillian/fallwinter02/meaningsofconvergence.pdf)