Object Based Video Coding - A Multimedia Communication Perspective
Muhammad Hassan Khan
2004-03-0020

Post on 21-Dec-2015



Recap …
  Motivation for Video Coding
  Today’s Video Coding
  Problems with today’s video coding
  Desirable Features
  Solution to get desirable features
  Object Based Video Coding
  MPEG-4 Support
  Model Based Coding
  Major Problem: Segmentation
  Segmentation by Graph Cuts

Overview of Today’s Presentation
  Details of the Segmentation Process
  Segmentation using Graph Cuts
  Results
  What can we do once we have the segmented regions
  Block-based Vs Parametric Motion Representation
  Compatibility with MPEG-4

Segmentation using Graph Cuts
  Let's quickly see what a graph cut is!

Segmentation using Graph Cuts
  Let's now see what Max Flow – Min Cut is
  [Figure: Max Flow – Min Cut example graph; values shown: 34, 7, 2, 2, 29, 92]
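To make the max-flow/min-cut idea concrete, here is a minimal sketch using the Edmonds-Karp algorithm. The graph, the node names and the function name `max_flow_min_cut` are illustrative assumptions and are not taken from the slide's figure. When no augmenting path is left, the nodes still reachable from the source form one side of a minimum cut, and the cut's capacity equals the maximum flow.

```python
from collections import deque

def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp max flow. capacity is {u: {v: cap}}.
    Returns (flow value, set of nodes on the source side of a min cut)."""
    # Residual graph: forward edges plus zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No path left: 'parent' holds every node reachable from source.
            return flow, set(parent)
        # Collect the path edges, find the bottleneck, push flow along it.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

# Hypothetical example graph (not the one on the slide).
example = {
    "s": {"a": 10, "b": 10},
    "a": {"b": 2, "t": 4},
    "b": {"t": 9},
}
flow_value, source_side = max_flow_min_cut(example, "s", "t")
print(flow_value, sorted(source_side))
```

In this example the cheapest cut severs the two edges entering `t`, so the cut capacity (4 + 9) matches the flow value.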

Segmentation using Graph Cuts
  How does it relate to segmentation of images?
  It is primarily a pixel labeling problem
  Consider we want to label a pixel
    D = Distance Function (depends on the current pixel)
    S = Smoothness Function (depends on the neighborhood)
    To be minimized = α D + (1 − α) S
    α serves as a prior!
  Hence graph based segmentation answers the question: What is the best segmentation, given this function?
  We still haven’t answered how the two relate…
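As a rough illustration of the function being minimized, the sketch below evaluates α D + (1 − α) S for a whole labeling of a small image. The slides do not say which D and S are used, so two common choices are assumed here: D is the squared difference between a pixel and a representative intensity for its label, and S is a Potts penalty that counts neighboring pixel pairs with different labels. The name `labeling_energy` and the sample values are hypothetical.

```python
def labeling_energy(image, labels, label_values, alpha):
    """Return alpha * D + (1 - alpha) * S summed over a 2-D labeling.

    Assumed D: squared difference between a pixel and its label's intensity.
    Assumed S: Potts penalty, 1 for each 4-connected pair with unequal labels.
    """
    rows, cols = len(image), len(image[0])
    data_term = 0.0
    smooth_term = 0.0
    for r in range(rows):
        for c in range(cols):
            data_term += (image[r][c] - label_values[labels[r][c]]) ** 2
            # Look only right and down so each neighbor pair is counted once.
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                smooth_term += 1
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                smooth_term += 1
    return alpha * data_term + (1 - alpha) * smooth_term

image = [[10, 12, 90],
         [11, 88, 92]]
label_values = {0: 10, 1: 90}          # hypothetical label intensities
good = [[0, 0, 1],
        [0, 1, 1]]
bad = [[0, 1, 1],
       [0, 1, 1]]
print(labeling_energy(image, good, label_values, alpha=0.5))
print(labeling_energy(image, bad, label_values, alpha=0.5))
```

A larger α trusts the per-pixel evidence more, while a smaller α favors smoother label regions.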

Segmentation using Graph Cuts
  Let us construct a simple graph to see how the two (graph cuts and segmentation of images) relate
  [Figure: graph with terminals α and β, terminal links weighted D(α) and D(β), and a neighbor link weighted S]

Segmentation using Graph Cuts
  Start with an initial labeling
  Find the Min-Cut
  Adjust the labels
  Iterate until a good minimization of the function is reached
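The loop above can be sketched for the simplest case: two labels on a 1-D signal. This is a simplified stand-in and not the multi-label swap or expansion moves of Boykov et al. The assumptions are that each label is modeled by a mean intensity, D is the squared difference to that mean, S is a constant penalty between neighbors with different labels, and the names `segment_two_labels`, `smooth_weight` and the sample signal are hypothetical. Each pass builds the graph, finds a min cut, relabels the pixels, and re-estimates the two means until the labels stop changing.

```python
from collections import deque

def min_cut_source_side(residual, source, sink):
    """Run Edmonds-Karp on residual capacities; return the source-side nodes."""
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-9 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent)
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck

def segment_two_labels(signal, alpha=0.5, smooth_weight=100.0, max_iters=10):
    means = [float(min(signal)), float(max(signal))]  # initial label models
    labels = None
    n = len(signal)
    for _ in range(max_iters):
        residual = {node: {} for node in ["s", "t", *range(n)]}

        def add_edge(u, v, cap):
            residual[u][v] = residual[u].get(v, 0.0) + cap
            residual[v].setdefault(u, 0.0)

        for p, value in enumerate(signal):
            # Cutting s->p puts p on the sink side, i.e. gives it label 1.
            add_edge("s", p, alpha * (value - means[1]) ** 2)
            # Cutting p->t keeps p on the source side, i.e. gives it label 0.
            add_edge(p, "t", alpha * (value - means[0]) ** 2)
            if p + 1 < n:
                add_edge(p, p + 1, (1 - alpha) * smooth_weight)
                add_edge(p + 1, p, (1 - alpha) * smooth_weight)
        source_side = min_cut_source_side(residual, "s", "t")
        new_labels = [0 if p in source_side else 1 for p in range(n)]
        if new_labels == labels:
            break                      # labels stopped changing
        labels = new_labels
        for k in (0, 1):               # adjust the label models
            members = [v for v, lab in zip(signal, labels) if lab == k]
            if members:
                means[k] = sum(members) / len(members)
    return labels, means

labels, means = segment_two_labels([10, 12, 11, 60, 90, 88, 91])
print(labels, means)
```

The capacity of the cut equals the value of the function for the corresponding labeling, which is why finding the min cut finds the best two-label assignment for the current means.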

Results


What can we do once we have the segmented regions?
  Shape Description
    Generalized Hough Transform
    R-Table based representation
    We need to know a few things
      Centroid of a shape
  Texture Model
    Not explored in detail yet!

Example for Centroid

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 1 1 0 0 0
0 0 1 0 0 1 0 0
0 1 0 0 0 0 1 0
0 1 0 0 0 0 1 0
0 0 1 0 0 1 0 0
0 0 0 1 1 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

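$$(\bar{r}, \bar{c}) = (4.5, 3.5)$$

$$\bar{r} = \frac{1}{12}\sum r = \frac{1}{12}(2+2+3+3+4+4+5+5+6+6+7+7) = 4.5$$

$$\bar{c} = \frac{1}{12}\sum c = \frac{1}{12}(1+1+2+2+3+3+4+4+5+5+6+6) = 3.5$$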
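The centroid example can be checked with a few lines of code. This is a small sketch that assumes zero-based row and column indices and that the second-to-last shape row of the grid has eight entries like the others; the function name `centroid` is illustrative.

```python
def centroid(mask):
    """Mean (row, column) of all non-zero pixels, using zero-based indices."""
    points = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    n = len(points)
    return (sum(r for r, _ in points) / n,
            sum(c for _, c in points) / n)

shape = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(centroid(shape))  # (4.5, 3.5)
```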

Shape Description-Finding Centroid
  For each boundary point (x, y)
    Find r = (x’, y’)
    xc = x + x’
    yc = y + y’
  Φ is the angle which the tangent at (x, y) makes with the x-axis
  [Figure: boundary point (x, y) with tangent angle Φ and vector r = (x’, y’) pointing to (xc, yc)]

Shape Description-Creating R-Table
  R-Table
  [Figure: vectors r = (x’, y’) from boundary points (x, y) to (xc, yc), grouped by tangent angle Φ = 0, 45, 90, 135, 180, 225, 270, 315]
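One possible way to build such an R-Table is sketched below. It assumes the boundary is given as an ordered, closed list of (x, y) points, estimates the tangent at each point from its two neighbors, and quantizes Φ into 45-degree bins as in the figure. The function name `build_r_table`, the tangent estimate and the sample contour are assumptions made for illustration and are not taken from the slides.

```python
import math
from collections import defaultdict

def build_r_table(contour, bin_size=45):
    """Map each quantized tangent angle (degrees) to a list of r = (x', y')
    vectors such that xc = x + x' and yc = y + y'."""
    n = len(contour)
    xc = sum(x for x, _ in contour) / n
    yc = sum(y for _, y in contour) / n
    table = defaultdict(list)
    for i, (x, y) in enumerate(contour):
        prev_x, prev_y = contour[i - 1]          # wraps around at i = 0
        next_x, next_y = contour[(i + 1) % n]
        # Tangent direction estimated from the two neighboring points.
        phi = math.degrees(math.atan2(next_y - prev_y, next_x - prev_x)) % 360
        phi_bin = int(round(phi / bin_size)) % (360 // bin_size) * bin_size
        table[phi_bin].append((xc - x, yc - y))
    return (xc, yc), dict(table)

# Hypothetical contour: the boundary of a small square, listed in order.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
center, r_table = build_r_table(square)
for phi in sorted(r_table):
    print(phi, r_table[phi])
```

For this contour every one of the eight angle bins receives exactly one vector, and adding any stored vector to its boundary point gives back the centroid.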

Encoding The R-Table
  This can heavily exploit the redundancy between the magnitudes and directions of R-Table entries
  We might as well go for DPCM (not decided yet)
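As a minimal sketch of what DPCM would mean here, the code below stores each value as its difference from the previous one, so slowly varying sequences turn into small residuals that are cheaper to entropy-code. It is a lossless variant with no quantizer and a previous-sample predictor, which is an assumption; the sample magnitudes are made up for illustration.

```python
def dpcm_encode(values):
    """Replace each value by its difference from the previous value."""
    residuals = []
    previous = 0
    for value in values:
        residuals.append(value - previous)
        previous = value
    return residuals

def dpcm_decode(residuals):
    """Invert dpcm_encode by accumulating the differences."""
    values = []
    previous = 0
    for residual in residuals:
        previous += residual
        values.append(previous)
    return values

magnitudes = [20, 21, 21, 22, 24, 23]   # hypothetical R-Table magnitudes
encoded = dpcm_encode(magnitudes)
print(encoded)                          # [20, 1, 0, 1, 2, -1]
print(dpcm_decode(encoded) == magnitudes)
```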

Benefits
  Objects are encoded independently and can hence be manipulated independently in the transform domain

Block-based Vs Parametric Motion Representation
  Block based
    Use variable block sizes within the segmented object, based on the texture model
    Use smaller blocks around the boundary pixels
  Parametric Motion
    We know that, if the world is planar, two images of the same scene taken with a perspective camera are related by a projective transformation
    We can assume each object lies in a plane, similar to the concept of a VOP, and compute the projective transformation parameters to estimate motion
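To illustrate the parametric model, the sketch below applies a 3×3 projective transformation (homography) to a point: the point is mapped in homogeneous coordinates and then divided by the third component. The matrices and the name `apply_homography` are illustrative assumptions. Estimating the parameters from two frames is not shown.

```python
def apply_homography(H, point):
    """Map (x, y) through the 3x3 matrix H and return the projected (x', y')."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]     # homogeneous scale
    x_new = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    y_new = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (x_new, y_new)

# Pure translation: the bottom row is (0, 0, 1), so w is always 1.
translation = [[1, 0, 5],
               [0, 1, -3],
               [0, 0, 1]]
# A perspective term in the bottom row makes the scale depend on x.
perspective = [[1, 0, 0],
               [0, 1, 0],
               [0.5, 0, 1]]
print(apply_homography(translation, (2, 2)))   # (7.0, -1.0)
print(apply_homography(perspective, (2, 4)))   # (1.0, 2.0)
```

A homography has eight free parameters (the matrix is defined up to scale), so one parameter set per object can replace many per-block motion vectors.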

Compatibility with MPEG-4
  [Figure: MPEG-4 audiovisual scene. Audiovisual objects (3D objects, 2D background, sprite, voice) are placed in a scene coordinate system (x, y, z) and form the audiovisual presentation; a video compositor projects them onto a projection plane for a hypothetical viewer and an audio compositor feeds the speaker; display, user input and user events are shown, with hierarchically multiplexed downstream and upstream control / data]

Hierarchical Description

The scene divided into objects

Our Shape/Texture Representation Goes Here

