Story Compression: Aggregating News Feeds
Joseph W. Barker
Advisor: James W. Davis
Ohio State University

What is Story Compression?
• News broadcasts from multiple sources tend to cover the same stories
• Stories have content overlap
  – General content covered by multiple sources
  – Specific content covered by one source
• Information gathering
  – Waste time if viewing all broadcasts (general content → redundancy)
  – Miss information if viewing only one broadcast (specific content)
• Answer: Story Compression
  – Detect general vs. specific content and create a single story from all broadcasts with no redundancy

Overview
• Divide story into content segments (i.e., single ideas)
  – Video shot (continuous scene) detection
• Compare segments
  – Speech/text contains most of the informational content
  – Word similarity → segment similarity
• Detect specific vs. general segments

Word Similarity
• Focus on concepts rather than specific word matching
• Graph-based hierarchy of word-concept relationships
  – E.g., WordNet
• Malik et al. 2007
  – $s(w_1, w_2) = \dfrac{2 \cdot d(\mathrm{lcs}(w_1, w_2))}{d(w_1) + d(w_2)}$
• Li et al. 2003
  – $s(w_1, w_2) = e^{-\alpha\, l(w_1, w_2)} \tanh\big(\beta\, d(\mathrm{lcs}(w_1, w_2))\big)$
• Notation: $d(\cdot)$ is depth below the hierarchy root, $\mathrm{lcs}(w_1, w_2)$ is the lowest common subsumer of the two words, $l(w_1, w_2)$ is the shortest path length between them, and $\alpha$, $\beta$ are scaling constants
[Figure: example concept hierarchy with nodes Object, Mammal, Feline, Canine, Cat, Poodle]

Segment Similarity
• Sentence similarity?
  – Segments range from sub-sentence to multiple sentences
  – Also, sentence boundaries (when multiple) poorly defined
  – Sentence similarity emphasizes grammar/word order; won’t work
• If ordering is problematic, use unordered groups instead
• Solution: Graph collapsing
  – Group of nodes collapsed to single node by summing edge weights
  – Inspired by spectral clustering and notion of random walk on graphs
  – Random walk between groups equivalent to random walk between collapsed nodes
[Figure: graph collapsing; panels: Segment Similarity, Word Similarity]

Most Unique Segments
• Manual segmentation employed
• Specific content
• Uniqueness → overall dissimilarity
• Perfect dissimilarity → similarity matrix rows/columns zero except for diagonal
• Thus, sum of row/column should approach zero for most dissimilar segments

Most Related Segments
• General content
• Related → group self-similar
• Perfect self-similarity → similarity matrix elements for group all one
• Thus, sum of elements should approach $n^2$ ($n$ = number in group)

[Figure: Segment Pair Similarity (higher is better); x-axis: Segment pairs (sorted); y-axis: Similarity]
[Figure: Segment Uniqueness (lower better); x-axis: Segments (sorted); y-axis: Uniqueness]
[Figure: example similarity matrices; panels: Perfect dissimilarity, Somewhat dissimilar, Perfect similarity, Somewhat similar]

Automatic Segment Detection
• How to decide boundaries between segments?
  – No sentence boundaries, so text not strong indicator
• Shot detection: Detect visual change from one scene to another
• Common techniques:
  – Temporal extent
    • Consecutive: compare sequential pairs of frames
    • Key frame: compare to “key” frame of previous segment
  – Distance measures
    • Pixel-based: Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), Normalized Cross-Correlation (NCC)
    • Color-based (histograms): $\chi^2$, Bhattacharyya
    • Texture-based: Scale Invariant Feature Transform (SIFT)

Towards Improving Segment Detection
• Common methods give mediocre performance
• May be due to only examining single temporal extent
• Possible solution: Use graph collapsing to examine all temporal extents simultaneously
• Sum of blocks on diagonal approaches $n^2$ if members in segment
• Sum of block anti-diagonal approaches zero if corner is segment boundary
• Current problem: Scale of valleys (boundaries) varies quadratically with segment size, simple peak finding not good enough

[Figure: Shot Detection: Key Frame (First); x-axis: Normalized threshold (1 = perfect match); y-axis: F score; curves: SAD, SSD, NCC, SIFT-MR, BATTA-H16, CHI2-H16]
[Figure: Shot Detection: Consecutive; x-axis: Normalized threshold (1 = perfect match); y-axis: F score; curves: SAD, SSD, NCC, SIFT-MR, BATTA-H16, CHI2-H16]

Method      F      TP     FP     FN
SAD         0.747  0.596  0.081  0.322
SSD         0.746  0.595  0.044  0.362
NCC         0.770  0.626  0.009  0.365
BATTA-H16   0.779  0.638  0.125  0.237
CHI2-H16    0.210  0.117  0.005  0.878

[Figure: x-axis: Frame; y-axis: Anti-diagonal Sum]

Conclusion and Future Work
• Graph collapsing can be used to derive group similarity from similarity of group members
• Additionally, can be used to evaluate uniqueness of objects, relatedness of groups
  – Tested with text, working on video
• Future work
  – Finalize graph collapsing video segmentation
  – Expand word similarity to include multiple languages
  – Investigate sub-image feature extraction/matching
  – Examine other sources (e.g., YouTube)

Paired segments from two sources:
#1) ABC: “…declaring a public health emergency….” NBC: “…declaring a public health emergency….”
#2) NBC: “…after the virus killed….” CBS: “…sadly had claimed 18 lives….”
#3) ABC: “…declaring a public health emergency….” NBC: “…to repeat, declared a public health emergency….”
#4) ABC: “…they’ve set up a special tent….” CBS: “…a tent has been setup….”

Single-source segments:
#1) ABC: “In Boston today, the mayor sounded the alarm”
#2) ABC: “…moved onto the upper respiratory, which is a lot of coughing…”
#3) ABC: “…stay home when you are sick…”
#4) ABC: “…I’ve never been hit by a Mack truck…”
#5) CBS: “…is on the panel that decides what goes in the vaccine…”
#6) CBS: “…after confirmed cases of flu reach 700…”

[Figure captions and labels: Consecutive Shot Detection Across All Stories; Shot Detection on story FLU; Video similarity; Sum of diagonal blocks; axes: Frame, Block Start, Block End; sources: ABC, CBS, NBC]
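
Graph collapsing sketch
The graph collapsing idea (collapse a group of nodes into one node by summing edge weights) and the two derived scores (uniqueness as a row sum, relatedness as a sum over a group's elements) can be illustrated with a small Python sketch. This is only a sketch under stated assumptions, not the poster's implementation: the function names `collapse`, `uniqueness`, and `relatedness`, the toy similarity matrix, the choice to leave collapsed weights unnormalized, and the choice to exclude the diagonal from the uniqueness sum are all hypothetical.

```python
def collapse(sim, groups):
    """Collapse each group of nodes into one node by summing edge weights.

    sim    : square similarity matrix (list of lists), e.g. word similarity
    groups : list of index lists, one per segment (unordered word groups)
    Returns the group-by-group matrix of summed weights (unnormalized;
    any normalization by group size is left out as an assumption).
    """
    k = len(groups)
    collapsed = [[0.0] * k for _ in range(k)]
    for a, group_a in enumerate(groups):
        for b, group_b in enumerate(groups):
            collapsed[a][b] = sum(sim[i][j] for i in group_a for j in group_b)
    return collapsed


def uniqueness(sim, idx):
    """Row sum for node idx, diagonal excluded (assumption).

    Approaches zero for a node that is dissimilar to everything else,
    so lower values mean more unique.
    """
    return sum(value for j, value in enumerate(sim[idx]) if j != idx)


def relatedness(sim, group):
    """Sum of all similarity elements within a group.

    Approaches n^2 (n = number in group) when every element is one.
    """
    return sum(sim[i][j] for i in group for j in group)


# Toy word similarity matrix (hypothetical values, symmetric, ones on diagonal).
# Nodes 0 and 1 are closely related; nodes 2 and 3 are mostly unrelated.
word_sim = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.1, 0.0],
    [0.1, 0.1, 1.0, 0.2],
    [0.0, 0.0, 0.2, 1.0],
]

# Three segments given as unordered groups of word indices.
segments = [[0, 1], [2], [3]]
segment_sim = collapse(word_sim, segments)

# Relatedness of the pair {0, 1}: close to n^2 = 4 because the pair is similar.
pair_score = relatedness(word_sim, [0, 1])

# Uniqueness of each segment in the collapsed graph (lower = more unique).
segment_uniqueness = [uniqueness(segment_sim, s) for s in range(len(segments))]
```

In this toy setup the relatedness of a group equals the diagonal entry of the collapsed matrix for that group, which is why the same summation serves both for deriving segment similarity from word similarity and for scoring how self-similar a group is.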