Experiments with Segmentation Strategies for Passage Retrieval in Audio-Visual Documents
Petra Galuščáková and Pavel Pecina
Institute of Formal and Applied Linguistics
Faculty of Mathematics and Physics
Charles University in Prague
4. 4. 2014
2
Information Retrieval
● Information Retrieval (IR) is a task which involves searching for documents relevant to a given query.
3
Speech Retrieval
● Speech Retrieval focuses on retrieval from audio-visual documents (recordings).
4
Speech Retrieval
● Speech Retrieval is often converted to traditional Information Retrieval
● An Automatic Speech Recognition (ASR) system is applied to the audio track
5
Speech Retrieval: Problems
● Documents are long (e.g. whole TV programmes)
● Often unstructured
● Navigation in audio-visual recordings is time consuming
● We need to retrieve relevant segments of full documents
● Possibility to browse the recordings using hyperlinks (links between passages)
→ Passage Retrieval
6
Passage Retrieval
● Splits texts into smaller units which then function as documents in the retrieval process
● Makes the retrieval process more precise
● May improve retrieval of full documents
● The segmentation is crucial for the quality of the retrieval
→ We focus on segmentation strategies
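To make the idea of passages functioning as documents concrete, here is a minimal sketch. It assumes the transcript has already been split into text segments and uses a toy term-frequency score; the function name `rank_passages` and the scoring are illustrative assumptions, not the retrieval model used in the experiments.

```python
from collections import Counter
from typing import List, Tuple


def rank_passages(passages: List[str], query: str) -> List[Tuple[int, int]]:
    """Rank passages by how often the query terms occur in them.

    Each passage is treated as an independent document. The score is a
    plain term-frequency sum (an assumption made for illustration).
    """
    query_terms = query.lower().split()
    scored = []
    for idx, passage in enumerate(passages):
        counts = Counter(passage.lower().split())
        score = sum(counts[term] for term in query_terms)
        scored.append((idx, score))
    # Highest score first; ties keep their original order (sorted is stable).
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical passages cut from one long recording.
passages = [
    "the weather report for today",
    "speech retrieval from audio recordings",
    "retrieval of speech segments and speech passages",
]
ranking = rank_passages(passages, "speech retrieval")
```

Because each passage is scored on its own, the ranking points to a specific part of the recording rather than to the recording as a whole.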
7
Segmentation Strategies
● Regular (Window-based)
  ● Segments of equal length with regular shift
  ● Claimed to be a very effective approach
● Similarity-based
  ● Measures similarity between neighbouring segments
● Lexical-chain-based
  ● Finds sequences of lexicographically related word occurrences
● Feature-based
  ● Employs machine learning methods to detect segment boundaries based on various features
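The regular (window-based) strategy can be sketched as follows. The input format (a list of tokens with start times in seconds), the function name `window_segments`, and the 60-second length with 30-second shift are assumptions chosen for illustration; the slides do not state the values used.

```python
from typing import List, Tuple

# Hypothetical input format: (token, start time in seconds).
Word = Tuple[str, float]


def window_segments(words: List[Word], length: float, shift: float) -> List[List[Word]]:
    """Cut a time-stamped transcript into equal-length, regularly shifted segments."""
    if length <= 0 or shift <= 0:
        raise ValueError("length and shift must be positive")
    if not words:
        return []
    segments = []
    start = words[0][1]
    last = words[-1][1]
    while start <= last:
        # Collect the words whose start time falls inside the current window.
        segment = [w for w in words if start <= w[1] < start + length]
        if segment:
            segments.append(segment)
        start += shift
    return segments


transcript = [("hello", 0.0), ("and", 20.0), ("welcome", 40.0), ("to", 60.0), ("class", 80.0)]
segments = window_segments(transcript, length=60.0, shift=30.0)
```

With a shift smaller than the window length, neighbouring segments overlap, so a relevant stretch of speech cut by one boundary is still covered whole by another segment.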
8
Feature-based Segmentation in Passage Retrieval
9
Experiments: Tasks Description
10
● MediaEval is a benchmarking initiative dedicated to development, comparison, and improvement of strategies for processing and retrieving multimedia content.
● E.g., speech recognition, multimedia content analysis, music and audio analysis, social networks, geo-coordinates, …
● 2013 Similar Segments in Social Speech Task
● 2013 Search and Hyperlinking Task
11
Similar Segments in Social Speech (SSSS) Task
● Scenario:
  ● A new member (e.g., a new student) joins a community or organization (e.g., a university), which owns an archive of recorded conversations among its members
  ● A member wants to find information according to his or her interest in the archive
    – The student wants to find more segments similar to the ones he or she is interested in and browses the archive using hyperlinks in videos
● The main goal:
  ● To find segments similar to the given ones
12
Similar Segments in Social Speech Task Data
● Purposely recorded interviews (5 hours) of two speakers (university student community)
● Letter cases
● Length of the silence before the word
● Division given in transcripts (e.g., speech segments defined in the LIMSI transcripts)
● The output of the TextTiling algorithm
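Two of the features listed above, letter case and the length of the silence before a word, can be computed from a time-stamped transcript roughly as follows. This is a sketch under assumptions: the input format, the function name `boundary_features`, and the feature names are hypothetical, and the classifier that would consume these features is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: (token, start time, end time), times in seconds.
TimedWord = Tuple[str, float, float]


def boundary_features(words: List[TimedWord], i: int) -> Dict[str, object]:
    """Compute per-word features that may signal a segment boundary before word i."""
    token, start, _ = words[i]
    # The first word has no predecessor, so its preceding silence is set to zero.
    prev_end = words[i - 1][2] if i > 0 else start
    return {
        "is_capitalized": token[:1].isupper(),
        "is_all_caps": token.isupper(),
        "silence_before": max(0.0, start - prev_end),
    }


words = [("so", 0.0, 0.5), ("yeah", 0.5, 1.0), ("Anyway", 3.0, 3.5)]
features = [boundary_features(words, i) for i in range(len(words))]
```

A long pause followed by a capitalized word, as before "Anyway" here, is the kind of evidence a trained model could use to place a segment boundary.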
21
Feature-based Segmentation Approaches
22
Experiments: Results
23
Similar Segments in Social Speech Task - Evaluation
● Best results are obtained by the feature-based segmentation into overlapping segments
● Manual gold-standard segmentation is outperformed by feature-based segmentation (MRRw score on the manual transcripts)
● Manual transcripts are significantly better in all scores
24
Segmentation Model in the SH Task
● Training set used in the SH Search Subtask is very small
● We apply the SSSS-trained models in the SH task
● Allows us to examine the possibility of creating a universal model for feature-based segmentation
● Potential problems:
  ● Different vocabulary (students' dialogues vs. TV programmes)
  ● Different ASR systems may prefer different vocabulary
  ● Different distribution of silence, document structure
25
SH Task Evaluation
● Results are not as consistent as for the SSSS task
● They depend on the type of the transcript
● Feature-based approaches creating overlapping segments are effective when applied to the subtitles
26
Conclusion
27
Conclusion
● Information Retrieval, focus on speech data (Speech Retrieval)
● Focus on retrieval of exact relevant passages
● Importance of segmentation
● Experiments in the MediaEval benchmark
● Similar Segments in Social Speech Task (university student dialogues) and Search and Hyperlinking Task (BBC programmes)
● We applied window-based segmentation and three types of feature-based segmentations
28
Conclusion cont.
● Feature-based segmentation applied in the two tasks outperformed regular segmentation
  ● Regular segmentation is claimed to be a very effective approach
● The improvement in the SSSS Task was statistically significant on the manual (MRRw and mGAP measures) and ASR (mGAP measure) transcripts
● The results in the SH task were not so conclusive
  ● Some of the results (on the subtitles) are encouraging
29
Thank you
This research has been supported by the project AMALACH (grant n. DF12P01OVV022 of the program NAKI of the Ministry of Culture of the Czech
Republic), the Czech Science Foundation (grant n. P103/12/G084), and the Charles University Grant Agency (grant n. 920913).