Low-level Motion Activity Features for Semantic Characterization of Video
Kadir A. Peker, A. Aydin Alatan, Ali N. Akansu
International Conference on Multimedia and Expo 2000
Dec 22, 2015
Introduction
We want to establish connections between low-level motion activity features of video segments and their semantically meaningful characterization.
Two computationally simple descriptors of the motion activity of video content are used.
Motion Activity Descriptors
act0: monotonous (steady) motion activity descriptor
act1: non-monotonous (unsteady) motion activity descriptor
act0 is sensitive to global motion, such as a camera pan, and to objects moving very close to the camera.
act1 filters out the component of motion activity that does not change from frame to frame.
In contrast to act0, act1 is more sensitive to unsteady motion, such as the erratic motion of a non-rigid object in close-up.
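The contrast between the two descriptors can be illustrated with a small sketch. The formulas below are assumptions made for illustration only and are not the definitions used by the authors: `act0` is taken as the mean motion-vector magnitude of a frame, and `act1` as the mean magnitude of the frame-to-frame change of each block's motion vector, a simple way of discarding the component that does not change between frames. The input vectors are made up.

```python
import math

def act0(frame):
    """Mean motion-vector magnitude of one frame (assumed form of act0)."""
    return sum(math.hypot(dx, dy) for dx, dy in frame) / len(frame)

def act1(prev_frame, frame):
    """Mean magnitude of the change in each block's vector between frames.

    Assumption: a first-order temporal difference stands in for the filter
    that removes the unchanging component of the motion.
    """
    return sum(
        math.hypot(dx - pdx, dy - pdy)
        for (pdx, pdy), (dx, dy) in zip(prev_frame, frame)
    ) / len(frame)

# Steady pan: every block moves by (4, 0) in both frames.
pan_prev = [(4.0, 0.0)] * 4
pan_curr = [(4.0, 0.0)] * 4

# Unsteady motion: each block's vector flips direction between frames.
jit_prev = [(3.0, 0.0), (0.0, 3.0), (-3.0, 0.0), (0.0, -3.0)]
jit_curr = [(-3.0, 0.0), (0.0, -3.0), (3.0, 0.0), (0.0, 3.0)]

print("pan:   ", act0(pan_curr), act1(pan_prev, pan_curr))
print("jitter:", act0(jit_curr), act1(jit_prev, jit_curr))
```

Under these assumed definitions the steady pan scores high on act0 and zero on act1, while the unsteady motion scores higher on act1 than on act0.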
Results from Application Examples
We use the two descriptors in two different application contexts:
Browsing through a sports video.
Retrieval from a database of shots.
Detecting Close-ups in Sports Video
We observe that the difference act1(n) - act0(n) is highest for close-up shots, where the irregular motion of the players in view dominates the regular global motion.
Test video: basketball from the MPEG-7 data set (10 minutes, 18000 frames, 4800 P-frames).
Ground truth data is prepared manually by segmenting the video into wide-angle and close-up shots (59 segments, 30 of them close-ups).
We expect m1 (act0(n)) to be high for close-up frames, because the motion vectors are larger when the camera zooms in or the action is close to the camera.
We expect that if the non-monotonous activity act1(n) is significantly higher than act0(n) in a frame, then with high probability the frame is a close-up of a highly active object.
Frame-based Detection
Two thresholds, for m1 and m2, are set to select 250 P-frames.
Bounding boxes are close-up segments.
Positive impulses are where m2 suggests a close-up.
Negative impulses are where m1 suggests a close-up.
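The thresholding step can be sketched as follows. This is a minimal illustration under stated assumptions: m1 is taken to be act0(n), m2 is assumed to be act1(n) - act0(n), and each threshold is assumed to be chosen so that a fixed number of frames passes it. The function names and the toy values are hypothetical.

```python
def threshold_for_count(values, count):
    """Return the value that the `count` largest entries meet or exceed.

    Expects 1 <= count; if count exceeds the list length, every entry passes.
    """
    ranked = sorted(values, reverse=True)
    return ranked[min(count, len(ranked)) - 1]

def detect_frames(m1, m2, count):
    """Return (m1 hits, m2 hits): frame indices at or above each threshold."""
    t1 = threshold_for_count(m1, count)
    t2 = threshold_for_count(m2, count)
    hits1 = [n for n, v in enumerate(m1) if v >= t1]
    hits2 = [n for n, v in enumerate(m2) if v >= t2]
    return hits1, hits2

# Toy per-P-frame values, made up for illustration.
m1 = [0.5, 4.0, 3.5, 0.8, 1.0, 3.0]     # assumed: act0(n)
m2 = [-0.4, -3.8, 0.2, 2.5, 3.1, -2.0]  # assumed: act1(n) - act0(n)

hits1, hits2 = detect_frames(m1, m2, count=2)
print("m1 suggests close-up at frames", hits1)
print("m2 suggests close-up at frames", hits2)
```

With tied values, more than `count` frames can pass a threshold; a stricter selection would need an explicit tie-breaking rule.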
Segment-based Detection
We find the close-up segments by sorting the segments with respect to sm1 and sm2 and choosing the top K.
The retrieval using sm1 (the average of act0 over the segment) is misled by camera motion. The first retrieved segment is a fast pan segment.
We find sm2 to be a more reliable detector for close-ups.
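The segment-based ranking can be sketched in a few lines. The slides define sm1 as the average of act0 over a segment; treating sm2 as the segment average of act1(n) - act0(n) is an assumption here, as are the function names, the segment boundaries, and the toy values.

```python
def segment_mean(values, start, end):
    """Average of per-frame values over frames start..end-1."""
    return sum(values[start:end]) / (end - start)

def top_k_segments(values, segments, k):
    """Rank segments by their mean value and return the k best indices."""
    scores = [segment_mean(values, s, e) for s, e in segments]
    order = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Toy per-frame descriptor values, made up for illustration.
act0_series = [5.0, 5.0, 1.0, 1.0, 2.0, 2.0]
act1_series = [0.5, 0.5, 1.2, 1.0, 4.0, 5.0]
segments = [(0, 2), (2, 4), (4, 6)]  # fast pan, wide angle, close-up

# Assumed per-frame measure behind sm2: act1(n) - act0(n).
diff_series = [a1 - a0 for a0, a1 in zip(act0_series, act1_series)]

by_sm1 = top_k_segments(act0_series, segments, k=1)
by_sm2 = top_k_segments(diff_series, segments, k=1)
print("top segment by sm1:", by_sm1)
print("top segment by sm2:", by_sm2)
```

In this toy data the ranking by sm1 returns the fast pan first, while the ranking by sm2 returns the close-up, mirroring the behaviour described above.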
Retrieval of High Activity Shots
A database of 600 shots from the MPEG-7 test set, including various programs such as news, sports, entertainment, education, etc.
The 5 highest-activity shots are retrieved using act0, act1 and (act1 - act0).
act0 and act1 retrieve shots that contain fast camera motion or an object that passes too close to the camera, which are not commonly considered high activity.
(act1 - act0) retrieves 5 shots of dancing people.
Conclusion
We described two descriptors to infer whether the activity content is dominantly a monotonous, steady motion or an unsteady, inconstant motion.
This kind of characterization of the activity content can be used to detect close-up segments in a sports video or in an activity-based query on a database of video shots.