Wavelet Video Coding – Principles, Applications and Standardization Mihaela van der Schaar Electrical and Computer Engineering Department University of California Davis

Wavelet Video Coding

Jan 21, 2022

Page 1: Wavelet Video Coding

Wavelet Video Coding – Principles, Applications and Standardization

Mihaela van der Schaar

Electrical and Computer Engineering Department

University of California Davis

Page 2: Wavelet Video Coding

2

Outline

• Introduction
• Scalable coding – principles (review)
• Basic principles of wavelets (review)
• Motion Compensated Wavelet Coding – basic principles and classification
• Motion Compensation Temporal Filtering (MCTF)
• Overcomplete Motion Compensated Wavelet Coding
• Encoding of spatio-temporal wavelet coefficients
• Scalable coding of motion information
• Error resilience aspects
• Current status in MPEG standardization
• Comparisons with state-of-the-art non-scalable coding techniques

Page 3: Wavelet Video Coding

Introduction

Page 4: Wavelet Video Coding

4

Challenges for ubiquitous multimedia communication

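[Figure: an encoder and server deliver video over IP-based networks (Internet/Internet2) to clients on heterogeneous links. Link labels: < 64 k; < 512 k; 802.11b (< 11 M); 802.11a (6 M); 802.11 (< 2 M); 3G/4G (64 k - 2 M).]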

Page 5: Wavelet Video Coding

5

Sample of concrete problems/questions

Signal processing
• compression efficiency versus quality of signal reproduction (rate-distortion tradeoffs)
• compression efficiency versus robustness to losses

Networking
• realistic channel models for effective joint source/channel coding
• source-channel interface control strategies for efficient network resource usage and high quality signal reproduction

Computer Architecture
• compression efficiency versus computational complexity

Page 6: Wavelet Video Coding

6

Possible solution: compression meets the network

• Do not require the transport mechanism to be flawless (modulation, channel coding, transmission protocol etc.), just design the coding system and transmission jointly

• Do not design for worst-case scenario - just adapt on the fly based on the network and device characteristics

Hence:

A. Scalable Coding

B. Adaptive Streaming

Page 7: Wavelet Video Coding

7

Principles of Scalable Coding

• Encoding of video signal with different resolution scales
• Downscaling of video signal by
  - Coding noise insertion – SNR Scalability
  - Spatial subsampling – Spatial Scalability
  - Sharpness reduction – Frequency Scalability
  - Temporal subsampling – Temporal Scalability
  - Selection of content – Content related Scalability

[Figure: Video Input → Scale Conversion & Encoding → low / medium / high Rate / Resolution outputs]

Page 8: Wavelet Video Coding

8

The Simple Way – Advance Scaling

• Requires feedback about channel / decoder status
• Only point-to-point connection supported
• Example: Stream switching

[Figure: Coder → Scale Converter → Network → Decoder]

Page 9: Wavelet Video Coding

9

The Parallel Way - Simulcast

• Run independent encoders in parallel
• Requires a priori knowledge about network and decoder capabilities to select optimum scaling levels
• Point-to-multipoint connections possible

[Figure: Low Scale Coder, Med. Scale Coder and High Scale Coder feed a Multiplex]

Page 10: Wavelet Video Coding

10

Simulcast

Multiplexed transmission of streams

• Loss in efficiency due to multiple streams
• Can cause network overload
• Restricted number of scales

[Figure: Multiplex Stream composed of a Low rate stream, a Medium rate stream and a High rate stream]

Page 11: Wavelet Video Coding

11

The Embedded Way – Layered Coding

"Chain of layers" - information from low resolution utilized to encode next-higher resolution

[Figure: layered coder and decoder. Input x passes through Preprocessing 1, 2, ... to Coder Layer 1 (base layer, quantizer Q1, output y1), Coder Layer 2 (Q2, y2), ..., Coder Layer T (enhancement layers, QT, yT). Each higher layer codes the difference (Σ, +/−) against the Midprocessing output of the layer below. Decoder Layers 1 to T use the same Midprocessing 1, 2, ... stages and summing nodes to rebuild each resolution.]

Page 12: Wavelet Video Coding

12

Layered Coding

• Layered coding supports embedded streams
• Re-configuration of bit stream for reconstruction with different spatial/temporal/quality resolution
• Possible loss in efficiency depends on coding scheme
• In theory, arbitrary number of scales could be achieved

[Figure: Full multiplex = high rate stream; Partial multiplex = medium rate stream; Low rate stream]

Page 13: Wavelet Video Coding

13

SNR Scalability – Re-quantisation

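Example: 2-stage quantizer

[Figure: the input is quantized by Q1 (large steps) to give the Base layer; the difference between the input and the Q1 output (Σ, +/−) is quantized by Q2 (small steps, ≤ Q1/2) to give the Enhancement layer. A second panel marks reconstruction values and decision (threshold) values for Q1 and Q2.]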
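A minimal Python sketch of the two-stage re-quantisation idea, under stated assumptions: both stages are uniform quantizers that round to the nearest multiple of their step size, and the names `quantize`, `encode_two_stage` and `decode` are hypothetical, not taken from the slides. The base layer carries the coarse index, the enhancement layer carries the quantized residual.

```python
def quantize(value, step):
    """Uniform quantizer (assumed): index of the nearest multiple of `step`."""
    return int(round(value / step))


def encode_two_stage(x, q1, q2):
    """Return (base index, enhancement index) for step sizes q1 > q2."""
    base_idx = quantize(x, q1)
    residual = x - base_idx * q1          # what the base layer missed
    enh_idx = quantize(residual, q2)      # refine with the small step
    return base_idx, enh_idx


def decode(base_idx, q1, enh_idx=None, q2=None):
    """Reconstruct from the base layer, optionally refined by the enhancement layer."""
    y = base_idx * q1
    if enh_idx is not None:
        y += enh_idx * q2
    return y


x = 13.7
base_idx, enh_idx = encode_two_stage(x, q1=8.0, q2=2.0)
base_only = decode(base_idx, 8.0)
refined = decode(base_idx, 8.0, enh_idx, 2.0)
print(base_only, refined)
```

A decoder that receives only the base layer still produces a usable value; adding the enhancement index reduces the error to at most half of the small step.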

Page 14: Wavelet Video Coding

14

SNR Scalability – Bit-plane Coding

Quantization related to bit planes

• No zero reconstruction, unsigned
• Zero reconstruction, sign/magnitude
• Zero reconstruction, sign/magnitude, dead zone

[Figure: three quantizer characteristics, one per variant, each refined over Bit 1, Bit 2, Bit 3 between MIN (or 0) and MAX, with reconstruction values and decision (threshold) values marked.]

Page 15: Wavelet Video Coding

15

SNR Scalability – Bit-plane Coding

• Magnitude of MSB encoded by run-length or binary entropy coding
• Sign and remaining bits encoded binary, conditional on MSB

[Figure: bit-plane table for samples 1 to 15 with rows Bit 5, Bit 4, Bit 3, Bit 2, Bit 1 and Sign; the individual bit values are not recoverable from this transcript. Run-length codes listed beside the planes: 4,9; 2,10; 3,5,2; 3,4,1; 0,1,1,0,2. The remaining bits are binary coded.]
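The bit-plane idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the coder from the slides: signs are kept in a separate list rather than coded conditionally on the MSB, and the helper names (`to_bit_planes`, `from_bit_planes`, `zero_runs`) and sample values are made up for the example.

```python
def to_bit_planes(values, num_planes):
    """Split integers into a sign list and magnitude bit planes, MSB plane first."""
    signs = [1 if v < 0 else 0 for v in values]
    mags = [abs(v) for v in values]
    planes = [[(m >> p) & 1 for m in mags]
              for p in range(num_planes - 1, -1, -1)]
    return signs, planes


def from_bit_planes(signs, planes, num_planes):
    """Rebuild values from however many planes were received (MSB first)."""
    mags = [0] * len(signs)
    for i, plane in enumerate(planes):
        weight = 1 << (num_planes - 1 - i)
        for n, bit in enumerate(plane):
            mags[n] += bit * weight
    return [-m if s else m for m, s in zip(mags, signs)]


def zero_runs(plane):
    """Run-length code of one plane: zeros preceding each 1 (trailing zeros dropped)."""
    runs, run = [], 0
    for bit in plane:
        if bit:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs


values = [3, -5, 0, 12, -1, 0, 0, 7]
signs, planes = to_bit_planes(values, num_planes=4)
full = from_bit_planes(signs, planes, 4)        # all planes: lossless
coarse = from_bit_planes(signs, planes[:2], 4)  # two MSB planes only
print(zero_runs(planes[0]), full, coarse)
```

Dropping the lower planes gives a coarser but still valid reconstruction, which is what makes the bit stream SNR scalable.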

Page 16: Wavelet Video Coding

16

Spatial Scalability

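Base-to-enhancement prediction

[Figure: the input is low-pass filtered and decimated (N:1) and quantized by Q1 to give the Base layer; the base signal is interpolated (low-pass filter, 1:N) and subtracted from the input (Σ, +/−); the difference is quantized by Q2 to give the Enhancement layer.]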

Page 17: Wavelet Video Coding

17

Temporal Scalability

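• Temporal downsampling with temporal anti-alias filter or by frame skipping
• Temporal upsampling by MC prediction

[Figure: same structure as spatial scalability. Low-pass filter N:1 (temporal subsampling, optional) and Q1 give the Base layer; a motion-compensated low-pass filter 1:N predicts the full-rate signal; the difference (Σ, +/−) is quantized by Q2 to give the Enhancement layer.]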

Page 18: Wavelet Video Coding

18

Frequency Scalability / "Data Partitioning"

• Popular in context of Transform Coding
• Allocation of coefficients to different layers depending on frequency
• Very low complexity

[Figure: Original Video → Single-layer Encoder → Output stream → Data Partitioning at the Priority Break Point → Base-layer stream (base-layer coefficients) and Enhancement-layer stream (enhancement-layer coefficients) → MUX]

Page 19: Wavelet Video Coding

19

Multiresolution Concepts

• Generate different resolution levels by successive down/upsampling operations
• Resolution pyramids example: Spatial resolution reduction by factors of 2

[Figure: resolution pyramid from the lowest resolution c0 through c1, c2, ... up to the full resolution cU-1]

Page 20: Wavelet Video Coding

20

Multiresolution Concepts – Pyramids

Gaussian Pyramid
• Each layer is self-contained
• Corresponds to Simulcast concept
• More samples to be encoded

[Figure: x(m,n) = cU-1 is repeatedly filtered by H(z1,z2) and decimated 4:1 to give cU-2, ..., c1, c0]

Page 21: Wavelet Video Coding

21

Multiresolution Concepts – Pyramids

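Laplacian Pyramid (Differential Pyramid)
• All lower-resolution layers required to reconstruct high-resolution layers
• Corresponds to Layered Coding concept
• Not critically sampled – more samples than original

[Figure: x(m,n) is filtered by H(z1,z2) and decimated 4:1 at each level; each lower-resolution image is interpolated (1:4, G(z1,z2)) and subtracted (Σ, +/−) from the next-higher resolution to give the difference layers cU-1, ..., c1; c0 is the lowest-resolution image.]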
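A small Python sketch of the differential pyramid, simplified on purpose: it is one-dimensional with 2:1 decimation (the slide is 2-D with 4:1), the analysis filter H is a pair average and the interpolation filter G is sample repetition. These filter choices and the function names are assumptions for illustration only.

```python
def down(x):
    """2:1 low-pass and decimation (pair average, an assumed filter H)."""
    return [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]


def up(x):
    """1:2 interpolation (sample repetition, an assumed filter G)."""
    return [v for v in x for _ in (0, 1)]


def build_laplacian(x, levels):
    """Return [finest difference layer, ..., coarsest difference layer, base]."""
    bands = []
    for _ in range(levels):
        low = down(x)
        bands.append([a - b for a, b in zip(x, up(low))])
        x = low
    bands.append(x)
    return bands


def reconstruct(bands):
    """Start from the base layer and add the difference layers back."""
    x = bands[-1]
    for diff in reversed(bands[:-1]):
        x = [a + b for a, b in zip(diff, up(x))]
    return x


signal = [4, 6, 10, 12, 8, 8, 2, 0]
bands = build_laplacian(signal, levels=2)
total_samples = sum(len(b) for b in bands)
print(total_samples, reconstruct(bands))
```

The pyramid holds 14 samples for an 8-sample input, which illustrates the "more samples than original" point, and every lower layer is needed to rebuild the full resolution.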

Page 22: Wavelet Video Coding

22

Multiresolution Concepts – Pyramids

Advantages:
• Pyramids can be combined with any coding scheme for the different resolution levels
• Downsampling can be made alias-free

Disadvantages:
• Number of pixels higher than in original signal
• Higher data rate than one-layer coding

Possible solution:
• Critically sampled pyramids (Wavelets)
• Disadvantage: Downsampled signals bear alias

Page 23: Wavelet Video Coding

Basic Principles of Wavelets

Page 24: Wavelet Video Coding

24

Filter Pairs

Critically sampled filter bank with 2 bands

Analysis low-/highpass filter pairs H0/H1

Synthesis low-/highpass filter pairs G0/G1

Number of samples c in frequency bands equal to total number of samples in signal x

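[Figure: two-band filter bank. Analysis: x(n) is filtered by H0(z) and H1(z), each followed by 2:1 decimation, giving c0 and c1. Synthesis: c0 and c1 are upsampled 1:2, filtered by G0(z) and G1(z), and summed (Σ) to give y(n).]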

Page 25: Wavelet Video Coding

25

Filter Pairs

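• Perfect reconstruction is possible
• Subsampled signals c usually bear alias!

$$Y(z) = \tfrac{1}{2}\,\underbrace{\left[H_0(z)G_0(z) + H_1(z)G_1(z)\right]}_{=\,2\cdot z^{-k}}\cdot X(z) \;+\; \tfrac{1}{2}\,\underbrace{\left[H_0(-z)G_0(z) + H_1(-z)G_1(z)\right]}_{=\,0}\cdot X(-z).$$

[Figure: two-band analysis/synthesis filter bank with H0(z), H1(z), 2:1 decimation, 1:2 upsampling, G0(z), G1(z); input x(n), subbands c0, c1, output y(n).]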

Page 26: Wavelet Video Coding

26

Biorthogonality Principle

Perfect reconstruction conditions

$$Y(z) = \tfrac{1}{2}\left[H_0(z)G_0(z) + H_1(z)G_1(z)\right]\cdot X(z) + \tfrac{1}{2}\left[H_0(-z)G_0(z) + H_1(-z)G_1(z)\right]\cdot X(-z).$$

Choosing

$$G_0(z) = z^{k}\cdot H_1(-z), \qquad G_1(z) = -z^{k}\cdot H_0(-z)$$

cancels the alias term:

$$\Rightarrow\; H_0(-z)\cdot z^{k}\cdot H_1(-z) - H_1(-z)\cdot z^{k}\cdot H_0(-z) = 0,$$

and leaves

$$Y(z) = \tfrac{1}{2}\left[H_0(z)H_1(-z) - H_1(z)H_0(-z)\right]\cdot X(z)\cdot z^{k}$$

$$\Rightarrow\; P(z) - P(-z) = 2\cdot z^{-k} \quad\text{with}\quad P(z) = H_0(z)\cdot H_1(-z).$$

Page 27: Wavelet Video Coding

27

Biorthogonality Principle

• H0(z)/G1(-z) and H1(z)/G0(-z) constitute orthogonal pairs
• Low-/Highpass transfer functions not symmetric
• Linear phase or non-linear phase filters possible
• Low-/Highpass impulse responses may have different length

[Figure: two-band analysis/synthesis filter bank with H0(z), H1(z), 2:1 decimation, 1:2 upsampling, G0(z), G1(z); input x(n), subbands c0, c1, output y(n).]

Page 28: Wavelet Video Coding

28

Biorthogonality Principle

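A simple biorthogonal filter pair (5/3 integer)

$$H_0^{(5/3)}(z) = \tfrac{1}{8}\left(-z^{2} + 2\cdot z + 6 + 2\cdot z^{-1} - z^{-2}\right)$$

$$H_1^{(5/3)}(z) = \tfrac{1}{2}\left(-1 + 2\cdot z^{-1} - z^{-2}\right)$$

$$G_0^{(5/3)}(z) = \tfrac{1}{2}\left(z + 2 + z^{-1}\right)$$

$$G_1^{(5/3)}(z) = \tfrac{1}{8}\left(-z + 2 + 6\cdot z^{-1} + 2\cdot z^{-2} - z^{-3}\right)$$

[Figure: two-band analysis/synthesis filter bank with H0(z), H1(z), 2:1 decimation, 1:2 upsampling, G0(z), G1(z); input x(n), subbands c0, c1, output y(n).]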

Page 29: Wavelet Video Coding

29

Lifting Filters

• Biorthogonal filter pairs can be factorized to be implementable in a "ladder structure"
• "Prediction" and "Update" steps using very short filter kernels are then iteratively performed
• "Lifting scheme" is most efficient implementation of wavelet filters available so far

[Figure: analysis ladder. The input (..ABAB..) passes through the Lazy Transform (even/odd sample separation) into branches A and B, followed by Prediction P1(z) and Update U1(z) up to Prediction PK(z) and Update UK(z); the branches are scaled by KL and KH to give L Out and H Out.]

Page 30: Wavelet Video Coding

30

Lifting Filters

• Synthesis filter pair is implemented by inverse signal flow
• Perfect reconstruction is obvious
• Quantization of signals in the ladder branches gives integer realization of analysis and synthesis

[Figure: synthesis ladder. L In and H In are scaled by 1/KL and 1/KH, pass through Update UK(z) and Prediction PK(z) down to Update U1(z) and Prediction P1(z) with inverted signs, and the Lazy Transform (even/odd sample grouping) merges branches A and B into the output (..ABAB..).]
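The ladder structure and its integer realization can be sketched in Python for the (5/3) filter pair. The prediction weights (−1/2, −1/2) and update weights (1/4, 1/4) follow the lifting diagram for the (5/3) filters; the rounding offsets, the symmetric boundary extension and the function names are assumptions of this sketch, and the input length is assumed to be even.

```python
def lift_53_forward(x):
    """Integer (5/3) lifting analysis of an even-length list of ints.

    Returns (low, high). Boundaries use symmetric extension (an assumption).
    """
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # Prediction step: each odd sample minus the rounded mean of its even neighbours.
    high = [odd[i] - ((even[i] + even[min(i + 1, n - 1)]) >> 1) for i in range(n)]
    # Update step: each even sample plus a rounded quarter of the neighbouring highs.
    low = [even[i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2) for i in range(n)]
    return low, high


def lift_53_inverse(low, high):
    """Inverse signal flow: undo the update, then undo the prediction."""
    n = len(high)
    even = [low[i] - ((high[max(i - 1, 0)] + high[i] + 2) >> 2) for i in range(n)]
    odd = [high[i] + ((even[i] + even[min(i + 1, n - 1)]) >> 1) for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x


x = [10, 12, 14, 13, 9, 7, 7, 8]
low, high = lift_53_forward(x)
print(low, high, lift_53_inverse(low, high))
```

Because the inverse subtracts exactly the rounded quantities that the forward pass added, reconstruction is exact in integer arithmetic regardless of the rounding rule chosen.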

Page 31: Wavelet Video Coding

31

Lifting Filters

Signal flow diagrams of lifting implementations for (5/3) filters and Haar (2/2) filters

[Figure: two signal flow diagrams, a) and b), each mapping input samples x(2m') ... x(2m'+6) to highpass outputs c1(m'), c1(m'+1), c1(m'+2) and lowpass outputs c0(m') ... c0(m'+3). Branch weights shown are −1, 1 and 1/2 in one panel and −1/2, 1 and 1/4 in the other.]

Page 32: Wavelet Video Coding

Motion Compensated Wavelet Coding – basic principles and classification

Page 33: Wavelet Video Coding

33

Wavelet Video Coding - Classification

• Intraframe coding (e.g. MJPEG)
• 3D wavelet coding without MC
• Hybrid video coding using wavelet-based texture coding
• In-Band Motion Compensation Prediction
• Motion Compensated Temporal Filtering
• In-Band Motion Compensated Temporal Filtering

Page 34: Wavelet Video Coding

34

History

• Using transforms for interframe coding goes back to the 1970s/1980s (e.g. Karlsson/Vetterli)
• Drawback was lack of motion compensation; first approach to filter over motion trajectories proposed by Kronander (1990)
• Solution avoiding an overcomplete transform developed by Ohm (1991, 1994)
• Solution for perfect reconstruction in case of half-pel motion by Ohm/Rümmler (1997), Hsiang and Woods (1999)
• Different researchers proposed combination with temporal axis lifting scheme, which makes virtually any MC possible: Pesquet/Bottreau, Luo/Li/Zhang, Secker/Taubman (2001)

Page 35: Wavelet Video Coding

35

Three-dimensional Wavelet

Temporal decomposition of a group of 8 frames (3 levels of wavelet transform)

[Figure: video sequence → 1st temporal level (H frames) → 2nd temporal level (LH frames) → 3rd temporal level (LLL and LLH frames)]

Page 36: Wavelet Video Coding

36

Three-dimensional Wavelet Coding

• Extension of zero tree approach to temporal dimension
• Non-recursive coding structure
• Examples:
  - "3D SPIHT" by Pearlman et al.
  - Layered Zero Coding (LZC) by Taubman and Zakhor (only constant displacement motion compensation)

Page 37: Wavelet Video Coding

37

Wavelets and Motion Compensation

Motion compensation is key
• To achieve good compression performance
• To guarantee visual quality – non MC/interframe coding with same SNR usually looks worse

Motion-compensated Wavelet video coding
• Temporal MC prediction followed by Wavelet Transform
• Wavelet Transform followed by temporal MC prediction in wavelet domain
• 3D Wavelet with MC

Page 38: Wavelet Video Coding

38

Hybrid Video Coding using Wavelets

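Replacement of DCT by Wavelet for 2D encoding in MC prediction loop

[Figure: hybrid coder and decoder. Coder: the MC prediction is subtracted from the input (Σ), the residual passes through DWT, quantizer Q and coder C; an IDWT and a summing node (Σ) rebuild the reference for MC, with motion vectors from ME. Decoder: decoder D, IDWT, and a summing node (Σ) with MC.]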

Page 39: Wavelet Video Coding

39

Hybrid Video Coding using Wavelets

Problems and possible solutions:

• Wavelet analysis is block-overlapping, discontinuities in motion vector field cause problems
  - Overlapping-block MC
• Local switching between Intra/inter modes not block-wise
  - Symmetric extension at block discontinuities
• Drift problem in MC loop is not solved
  - This is not a real scalable solution

Page 40: Wavelet Video Coding

40

Motion compensation in the wavelet domain

• Multi-resolution nature of wavelet decomposition is ideal for providing spatially scalable video (QCIF, CIF, SD, and HD)
• Subbands are highly correlated in the temporal direction
  - Motion estimation and compensation can significantly reduce the temporal correlation
• Classical approach
  - MRMC (multi-resolution motion compensation scheme)
  - Ref: Y. Zhang and S. Zafar, IEEE CSVT, Sept. 1992.

Page 41: Wavelet Video Coding

41

Multi-Resolution Motion Compensation

Page 42: Wavelet Video Coding

42

MC in Wavelet Domain - Encoder

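[Figure: encoder block diagram. A 2D DWT feeds three prediction loops, one per resolution (low, med., high), each with a subtractor (Σ), quantizer Q, motion compensation MC, summing node (Σ) and frame store FS. Motion Estimation vectors are scaled by :2 between resolutions and passed to Motion Coding; Coeff. COD. and the motion data are combined in the MUX.]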

Page 43: Wavelet Video Coding

43

MC in Wavelet Domain

• Variable block size of the m-th layer subbands for M-level decomposition:

$$p\,2^{M-m} \times p\,2^{M-m}$$

• Motion vector for each subband (j=1,..,3):

$$V^{(m)}_{i,j}(x,y) = 2^{M-m}\, V^{(M)}_{i,0}(x,y) + \Delta^{(m)}_{i,j}(x,y)$$

• Adaptive search range for each subband

i: frame number, j: subband index (j=0,…,3), m: layer number
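A short Python sketch of how the two formulas might be evaluated. It assumes the reading that the coarsest-layer vector V^(M)_{i,0} is scaled by 2^(M−m) and then refined by a per-subband correction Δ; the function names and the numeric values are hypothetical.

```python
def block_size(p, M, m):
    """Block size p*2^(M-m) x p*2^(M-m) for layer m of an M-level decomposition."""
    side = p * 2 ** (M - m)
    return side, side


def predict_subband_mv(base_mv, M, m, delta=(0, 0)):
    """Scale the coarsest-layer vector to layer m and add the refinement delta."""
    scale = 2 ** (M - m)
    return scale * base_mv[0] + delta[0], scale * base_mv[1] + delta[1]


# Hypothetical numbers: 3-level decomposition, base block size p = 2.
print(block_size(p=2, M=3, m=1))
print(predict_subband_mv(base_mv=(1, -2), M=3, m=1, delta=(1, 0)))
```

Only the small refinement Δ needs to be searched and coded per subband, which is why the search range can be adapted to each layer.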

Page 44: Wavelet Video Coding

44

MC in Wavelet Domain – Advantages and Drawbacks

• Multiple (separate) MC loops for wavelet bands
  - one set of motion parameters may be used for all
• No drift problem in spatial scalability
  - Possible to skip higher frequency bands
• Switching to "intra" coding mode without penalty
  - Inverse DWT is applied to images (not differences)
• Inefficiency of MC prediction in high bands
  - Significant performance loss compared to ME/MC in spatial domain (e.g. 1-2 dB)
  - The shift variant property of wavelet decomposition

Page 45: Wavelet Video Coding

Motion-Compensated Temporal Filtering

Page 46: Wavelet Video Coding

46

Motion Compensated Haar Filters

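Non-orthonormal Haar filter basis

$$H_0(\mathbf{z}) = \tfrac{1}{2}\left(1 + z_1^{\tilde{k}}\cdot z_2^{\tilde{l}}\cdot z_3^{-1}\right)$$

$$H_1(\mathbf{z}) = -z_1^{k}\cdot z_2^{l} + z_3^{-1}.$$

The powers of z1 and z2 express the MC shift; z3^{-1} is a delay by one frame.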

Page 47: Wavelet Video Coding

47

Motion Compensated Haar Filters

This motion-compensated filtering is no problem whenever unique sample-wise correspondences exist between two frames

Real motion vector fields are discontinuous, such that correspondences may not be unique

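$$\tilde{k}(m_B, n_B) = -k(m_A, n_A), \qquad \tilde{l}(m_B, n_B) = -l(m_A, n_A)$$

$$\text{with}\quad \begin{cases} m_B = m_A + k(m_A, n_A) \\ n_B = n_A + l(m_A, n_A). \end{cases}$$

[Figure: motion trajectories between two frames, marking the origin of each motion trajectory; positions marked "?" are either covered/multiple connected or uncovered/unconnected.]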

Page 48: Wavelet Video Coding

48

Motion Compensated Haar Filters

Substitution technique for covered/uncovered areas allows perfect reconstruction at motion discontinuities

$$L(m,n) = 0.5\cdot B(m,n) + 0.5\cdot A\big(m+\tilde{k}(m,n),\, n+\tilde{l}(m,n)\big)$$

$$H(m,n) = A(m,n) - B\big(m+k(m,n),\, n+l(m,n)\big)$$

Special cases:

$$L(m,n) = B(m,n) \quad \text{if 'unconnected'}$$

$$H(m,n) = A(m,n) - \hat{A}(m,n) \quad \text{if 'multiple connected'}$$

$$\hat{A}(m,n) = B\big(m+k(m,n),\, n+l(m,n)\big) \quad \text{'backward mode'}$$

$$\hat{A}(m,n) = B_{-1}\big(m-k(m,n),\, n-l(m,n)\big) \quad \text{'forward mode'}$$

[Figure: frames B-1, A, B and the resulting H and L frames. "Unconnected" (uncovered) positions take the original values from B; "multiple connected" (covered) positions use MC prediction from the previous frame.]

Page 49: Wavelet Video Coding

49

Motion Compensated Haar Filters

Synthesis is straightforward in case of full-pixel correspondences

$$A(m,n) = L\big(m+k(m,n),\, n+l(m,n)\big) + 0.5\cdot H(m,n)$$

$$B(m,n) = L(m,n) - 0.5\cdot H\big(m+\tilde{k}(m,n),\, n+\tilde{l}(m,n)\big),$$

$$B(m,n) = L(m,n) \quad \text{if 'unconnected'}$$

$$A(m,n) = \hat{A}(m,n) + H(m,n) \quad \text{if 'multiple connected'}.$$
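The analysis and synthesis equations for the connected case can be checked with a small Python sketch. It is deliberately reduced: frames are 1-D lists, the motion is one constant integer shift k for the whole frame (so every sample is uniquely connected and the reverse displacement is −k), and borders are handled cyclically. All of these are assumptions of the sketch, as are the function names.

```python
def shift(frame, d):
    """Return g with g[n] = frame[(n + d) mod N] (cyclic borders, an assumption)."""
    N = len(frame)
    return [frame[(n + d) % N] for n in range(N)]


def mc_haar_analysis(A, B, k):
    """L(n) = 0.5*B(n) + 0.5*A(n + k~),  H(n) = A(n) - B(n + k),  with k~ = -k."""
    k_rev = -k
    L = [0.5 * b + 0.5 * a for b, a in zip(B, shift(A, k_rev))]
    H = [a - b for a, b in zip(A, shift(B, k))]
    return L, H


def mc_haar_synthesis(L, H, k):
    """A(n) = L(n + k) + 0.5*H(n),  B(n) = L(n) - 0.5*H(n + k~)."""
    k_rev = -k
    A = [l + 0.5 * h for l, h in zip(shift(L, k), H)]
    B = [l - 0.5 * h for l, h in zip(L, shift(H, k_rev))]
    return A, B


A = [1, 2, 3, 4, 5, 6, 7, 8]
k = 2
B_moved = shift(A, -k)            # frame B is frame A displaced by k samples
L, H = mc_haar_analysis(A, B_moved, k)
print(H)                           # pure translation: the highpass frame vanishes

B_other = [3, 1, 4, 1, 5, 9, 2, 6]
L2, H2 = mc_haar_analysis(A, B_other, k)
A_rec, B_rec = mc_haar_synthesis(L2, H2, k)
```

With perfectly matching motion the H frame is zero and L equals B; with arbitrary content the pair (L, H) still reconstructs both frames exactly.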

Page 50: Wavelet Video Coding

50

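Motion Compensated Haar Filters

2-band temporal Haar analysis filter

[Figure: analysis block diagram. Frame A, Frame B and the delayed Frame B-1 (z-1) feed Motion estimation and two Motion compensation blocks; a Connection & mode switch analysis block controls the switches. The H0 and H1 branches, each followed by 2:1 temporal subsampling, produce L and H; M denotes the motion information. Switch positions: U - unconnected, I - intraframe, F/B - forward/backward.]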

Page 51: Wavelet Video Coding

51

Motion Compensated Haar Filters

2-band temporal Haar synthesis filter

[Figure: synthesis block diagram. L and H are upsampled 1:2 and combined through the G0 and G1 branches (weights 0.5 and 1) with Motion compensation blocks driven by the motion information M; a Connection & mode switch control block sets the switches; Frame A, Frame B and the delayed Frame B-1 (z-1) are reconstructed. Switch positions: U - unconnected, M - multiple connected, I - intraframe, F/B - forward/backward. *) Switch open for I.]

Page 52: Wavelet Video Coding

52

Motion Compensated Temporal Wavelet Tree

[Figure: temporal wavelet tree built from cascaded analysis stages with inputs I, outputs L and H and motion information M, mirrored by synthesis stages with outputs O. Side blocks: Coding of motion information; 2-D Wavelet decomposition, quantization, encoding; Decoding of motion information.]

Scaling and Wavelet coefficients from temporal analysis (arranged as 2D images)

Page 53: Wavelet Video Coding

53

Motion-compensated Lifting Filters

Signal flow diagram

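[Figure: lifting signal flow between columns "A", "B", "H" and "L" (with intermediate A*, B*). Prediction branches carry weights −β and β−1; update branches carry weights (1−β)/2 and β/2; direct branches carry weight 1. Samples shown: B(m,n-1), B(m,n), B(m,n+1), A(m-k,n-l), A(m-k,n-l+1). Vertical shift by l+β pixels.]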

Page 54: Wavelet Video Coding

54

Motion-compensated Lifting Filters

Extensible to longer interpolation filters, e.g. (9/7)

[Figure: lifting signal flow between columns "A", "B", "I1", "I2", "H" and "L" (with intermediate A*, B*). Successive prediction and update steps carry weights 2(1−β)p1 and 2βp1, 2(1−β)u1 and 2βu1, 2(1−β)p2 and 2βp2, 2(1−β)u2 and 2βu2; direct branches carry weight 1.]

With β=0.5: Equivalent to the half-pel P.R. method proposed in [Ohm, Rümmler 97] and used in [Hsiang, Woods 99]

Page 55: Wavelet Video Coding

55

Motion-compensated Lifting filters

• The principle is straightforwardly extensible to
  - longer wavelet filters
  - separable (or non-separable 2D filters)
  - change of a with any position (e.g. MC based on affine model, dense motion vector fields)
• Coincidence of motion correspondences in adjacent prediction and update steps must be observed
• Lifting implementation of temporal wavelet filtering also leads to an elegant interpretation of previous covered/uncovered pixel substitution
• Very efficient implementation

Page 56: Wavelet Video Coding

56

Motion-compensated Lifting filters

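Adaptation at motion boundaries:
• "uncovered/unconnected" case
• Additional "lazy" pixel(s) in frame B

[Figure: lifting signal flow between columns "A", "B", "H" and "L" (with A*, B*) across a motion boundary (#). One side uses prediction weights −β1, β1−1 and update weights (1−β1)/2, β1/2; the other side uses −β2, β2−1 and (1−β2)/2, β2/2; direct branches carry weight 1.]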

Page 57: Wavelet Video Coding

57

Motion-compensated Lifting filters

Adaptation at motion boundaries:
• "Covered/multiple connected" case
• Additional prediction pixel(s) in frame A/H

[Figure: lifting signal flow between columns "A", "B", "H" and "L" (with A*, B*) across a motion boundary (#). One side uses prediction weights −β1, β1−1 and update weights (1−β1)/2, β1/2; the other side uses −β2, β2−1 and (1−β2)/2, β2/2; direct branches carry weight 1. Annotation: this pixel might also take the 'unconnected' role!]

Page 58: Wavelet Video Coding

58

Motion-compensated Lifting filters

Frame-wise or localized implementation of intra coding is a key concept in MC prediction coders

Switching to intra mode is applied whenever no motion correspondence can be found, e.g. scene changes or uncovered areas

In MC temporal filtering
• the equivalent is an adaptation of wavelet tree depth
• but intra coding could also be applied individually in the prediction and update steps

In general, localized mode switching can be included in a simple way in the lifting structure

Page 59: Wavelet Video Coding

59

More Flexibility in MC Lifting Filters

• Different view of one transform level: Temporal-axis lifting filters, including 2D MC in cross paths
• MC and IMC should be related such that pixels from A correspond to L

[Figure: video sequence B A B A B A. MC branches with weight −1 (direct weight 1) form the highpass sequence H H H; IMC branches with weight 1/2 (direct weight 1) form the lowpass sequence L L L.]

Page 60: Wavelet Video Coding

60

More Flexibility in MC Lifting Filters

• Extension to longer temporal filters (5/3)
• H frames equivalent to bidirectional prediction
• Forward/backward switching possible
• Better coding efficiency
• No temporal blocking
• More memory
• Higher delay
• More motion vectors

[Figure: video sequence B A B A B A B. Each H frame is predicted from both neighbours through MC branches with weights −1/2, −1/2 (0 and −1 where the filter switches to uni-directional); each L frame is updated from both neighbouring H frames through IMC branches with weights 1/4, 1/4 (0 and 1/2 at the switch). Outputs: highpass sequence H H H, lowpass sequence L L L L.]

Page 61: Wavelet Video Coding

61

More Flexibility in MC Lifting Filters

• Non-dyadic decomposition
• Temporal block units of 3 frames
• E.g. 30/10 Hz temporal scalability
• Can be extended to bidirectional MC in prediction step

[Figure: video sequence in units of three frames (B A B). MC branches with weight −1 form two H frames per unit (highpass sequence); IMC branches with weights 1/4 and 1/2 form one L frame per unit (lowpass sequence); direct branches carry weight 1.]

Page 62: Wavelet Video Coding

62

Low-Delay modes in MC Temporal Filtering

Temporal pyramid decomposition with omission of update step ("A" frames left as originals)

[Figure: Frames 1 to 4. Level 1: A H A H, with the A frames left as original. Level 2: the A frames from the previous level are filtered, giving A and H.]

Page 63: Wavelet Video Coding

63

Low-Delay modes in MC Temporal Filtering

Modified MCTF Scheme

[Figure: Frames 1 to 4 arranged as A H H A, with the A frames left as original.]

"A" frames can be inserted at arbitrary locations -> the sequence can be decoded at non-dyadic lower frame rates

Page 64: Wavelet Video Coding

64

Low-Delay modes in MC Temporal Filtering

A frames allow implementation of a low-delay mode

• A frames can be encoded and transmitted immediately, but must be stored for future reference
• H frames can be encoded and transmitted immediately in any of the schemes
• Disadvantage: lower coding efficiency
  - May be compensated by improved prediction

Page 65: Wavelet Video Coding

65

Low-Delay modes in MC Temporal Filtering

Inclusion of bi-directional prediction

[Figure: Frames 1 to 4 arranged as A H H A, with the A frames left as original.]

Choice between 3 modes:

- Use backward prediction

- Use forward prediction

- Use the average block of the backward and forward predictions while filtering
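A Python sketch of choosing among the three modes for one block. The slides do not state the selection criterion, so the sum of absolute differences (SAD) used here is an assumption, as are the function names and the sample blocks (given as flat lists of pixel values).

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for p, q in zip(a, b))


def choose_mode(block, backward_pred, forward_pred):
    """Pick the prediction with the smallest SAD and return (mode, H-block residual)."""
    average = [(p + q) / 2 for p, q in zip(backward_pred, forward_pred)]
    candidates = {
        "backward": backward_pred,
        "forward": forward_pred,
        "bidirectional": average,
    }
    mode = min(candidates, key=lambda name: sad(block, candidates[name]))
    residual = [p - q for p, q in zip(block, candidates[mode])]
    return mode, residual


block = [10, 20, 30, 40]
mode, residual = choose_mode(block,
                             backward_pred=[8, 18, 28, 38],
                             forward_pred=[12, 22, 32, 42])
print(mode, residual)
```

In this made-up example the two one-sided predictions err in opposite directions, so their average matches the block and the bidirectional mode wins.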

Page 66: Wavelet Video Coding

66

Low-Delay modes in MC Temporal Filtering

Prediction step can be enriched by selecting multiple reference frames

Reference Frame 1 Reference Frame 2 Current Frame

Advantages
• Improved coding efficiency
• Easy to incorporate the advanced MC & ME options used by predictive coders (H.264/AVC, MPEG-4 etc.)
• Reduced no. of unconnected pixels

Disadvantages
• Sacrifice Temporal Scalability
• Prediction drift can become a problem when decoding at lower bit-rates

Page 67: Wavelet Video Coding

67

Low-Delay modes in MC Temporal Filtering

Different configurations at any level of pyramid

[Figure: four panels over Frames 1 to 4: AHAH Scheme, Bi-directional AHAH Scheme, AHHA Scheme, Bi-directional AHHA Scheme.]

Page 68: Wavelet Video Coding

Overcomplete Motion Compensated Wavelet Coding

• Shift-variant property of wavelets
• Frame theory - overcomplete wavelets
• Low band shifting method
• Inband motion compensated temporal filtering
• Simulation results

Page 69: Wavelet Video Coding

69

The Shift Variance Property of Wavelets

Haar filter output of step edges (Haar-DWT)

[Figure: a step-edge signal and the same signal shifted by one pixel, each shown with its Haar DWT low band L1 and high band H1.]

• Low pass channel: prediction by linear interpolation
• High pass channel: no prediction possible!
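The effect can be reproduced with a few lines of Python. The sketch uses an unnormalized Haar pair (average and half-difference) and compares the critically sampled high band with an undecimated one; the filter scaling, the function names and the test signals are assumptions made for illustration.

```python
def haar_dwt(x):
    """Critically sampled Haar: pair averages (low) and pair half-differences (high)."""
    low = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    high = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    return low, high


def odwt_high(x):
    """Undecimated Haar highpass: one coefficient per sample position (no border wrap)."""
    return [(x[n] - x[n + 1]) / 2 for n in range(len(x) - 1)]


step = [0, 0, 0, 0, 1, 1, 1, 1]
shifted = [0, 0, 0, 1, 1, 1, 1, 1]   # the same edge, moved by one pixel

print(haar_dwt(step)[1])     # the edge falls between pairs: high band is all zero
print(haar_dwt(shifted)[1])  # the edge falls inside a pair: a coefficient appears
print(odwt_high(step), odwt_high(shifted))
```

In the critically sampled high band the edge is invisible for one phase and present for the other, so neither band predicts the other. Without downsampling, the high band of the shifted signal is simply the shifted high band.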

Page 70: Wavelet Video Coding

70

The Shift Variance Property of Wavelets

Page 71: Wavelet Video Coding

71

The Shift Variance Property of Wavelets

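[Derivation: the slide's equations (introduced by "Suppose", "By substituting" and "Hence,") are not recoverable from this transcript. The final expression relates the subband spectrum of a signal shifted by v samples to that of the unshifted signal and contains aliasing components.]

Aliasing components (zero only when v = even)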

Page 72: Wavelet Video Coding

72

Optimal Aliasing Reduction Filter Approach

In order to minimize the aliasing in wavelet domain ME/MC (X.Yang, K. Ramchandran, IEEE-TIP, May, 2000)

L : aliasing reduction filter

Page 73: Wavelet Video Coding

73

Optimal Aliasing Reduction Filter Approach

Page 74: Wavelet Video Coding

74

Optimal Aliasing Reduction Filter Approach

Still not efficient as motion estimation in spatial domain

Any ultimate solution ? Shift invariant Overcomplete Wavelets

Page 75: Wavelet Video Coding

75

Frame Theory – Overcomplete Wavelets

Properties of redundant frame
• Noise reduction
• More sparse representation (matching pursuit)
• Redundant representation (multiple description coding)
• Shift invariant property

Improved motion estimation/compensation in wavelet domain
• Only motion references need to be overcomplete
• Texture coding is still in complete wavelet domain

Page 76: Wavelet Video Coding

76

Shift Invariance of Overcomplete Wavelets

Overcomplete representation without downsampling

[Figure: a signal and the same signal shifted by one pixel, shown with their Haar-DWT bands (L1, H1) and their Haar-ODWT bands. Haar-DWT: low pass channel, prediction by linear interpolation; high pass channel, no prediction possible! Haar-ODWT: prediction possible in any case!]

Page 77: Wavelet Video Coding

77

Low-Band-Shift Method

• Optimal way of generating overcomplete wavelet coefficients for every shift

Page 78: Wavelet Video Coding

78

Low Band Shift Method for 2-D

[Figure: low-band shift decomposition of the original image. Level 1: LL, HL, LH and HH bands, each computed for shifts (0,0), (1,0), (0,1), (1,1). Level 2: LLLL, LLHL, LLLH and LLHH bands, each computed for shifts (0,0) to (3,3). Highlighted bands are those used for the complete wavelet expansion.]

(x,y): shift in (x,y) pixels in original image

# of reference frames = 3n+1

Page 79: Wavelet Video Coding

79

Conventional Wavelet Transform

Original image

LL HL LH HH

LLLL LLHL LLLH LLHH

Page 80: Wavelet Video Coding

80

Overcomplete Wavelet Transform by Low-Band Shift Method

LL

Original frame

LL HL LH HH

LLLL LLHL LLLH LLHH

Page 81: Wavelet Video Coding

81

Overcomplete Wavelet MC Coding - Coder

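[Figure: coder block diagram. A 2D DWT feeds three prediction loops (low, med., high resolution), each with a subtractor (Σ), quantizer Q, motion compensation MC, summing node (Σ) and frame store FS; the med. and high loops pass their reference through IDWT and ODWT before MC. Motion Estimation vectors are scaled by :2 between resolutions and passed to Motion Coding; Coeff. COD. and the motion data are combined in the MUX.]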

Page 82: Wavelet Video Coding

82

Overcomplete Wavelet MC Coding - Decoder

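[Figure: decoder block diagram. DMUX feeds a Coeff. Decoder and a Motion Decoder (vectors scaled by :2 between resolutions). Three loops, each with a summing node (Σ), MC and frame store FS, give the low, medium and high resolution reconstructions; the med. and high loops pass their reference through IDWT and ODWT.]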

Page 83: Wavelet Video Coding

83

Overcomplete Wavelet MC Coding

• ODWT is Wavelet without subsampling
  - More samples than original, like Pyramid representation
• Allows Wavelet domain MC for high frequency components
  - signal does not bear frequency-inversion alias component
• Still only necessary to encode critically sampled coefficients
  - Overcomplete transform coefficients can be generated locally within the decoder
• Still does not resolve the drift in SNR scalability
  - may be solved by multiple loops in each wavelet band

Solution: In-Band MCTF (IBMCTF)

Page 84: Wavelet Video Coding

84

Spatial-Domain MCTF (SDMCTF)

[Figure: SDMCTF coder. Video (current frame) → MCTF (ME and Temporal Filtering, producing MV and Ref. Frame No.) → DWT → SBC → EC → TRANSMISSION; the motion vectors are coded by the MVC.]

DWT: Discrete Wavelet Transform; SBC: Sub-Band Coder; EC: Entropy Coder; ME: Motion Estimation; MVC: Motion Vector Coder

Page 85: Wavelet Video Coding

85

In-band MCTF (IBMCTF)

[Figure: IBMCTF coder. Video (current frame) → DWT → CODWT → MCTF (ME and Temporal Filtering, producing MV and Ref. Frame No.) → SBC → EC → TRANSMISSION; the motion vectors are coded by the MVC.]

DWT: Discrete Wavelet Transform; SBC: Sub-Band Coder; EC: Entropy Coder; CODWT: Complete to Overcomplete DWT; ME: Motion Estimation; MVC: Motion Vector Coder

Page 86: Wavelet Video Coding

86

IBMCTF: concept

temp

hor

ver

Page 87: Wavelet Video Coding

87

IBMCTF Wavelet Video

temp

hor

ver

For efficient IBMCTF, ME should be performed in overcomplete wavelet domain

Page 88: Wavelet Video Coding

88

3-D decomposition structure

temp

ver

hor

temp

hor

ver

SD- MCTF Inband MCTF

Page 89: Wavelet Video Coding

89

Block diagram of IBMCTF coder

[Figure: Input Video → Wavelet transform → Band 1, Band 2, ..., Band N. Each band is broken into GOFs and processed by its own IBMCTF unit (IBMCTF 1, IBMCTF 2, ..., IBMCTF N) containing Motion Estimation, Temporal Filtering (MV and Ref. Frame No.) and Texture Coding; the outputs form the Bitstream.]

Page 90: Wavelet Video Coding

90

IBMCTF coding

• Allows Wavelet domain MC using shift-invariant overcomplete wavelets by Low-Band Shift method
• Still only necessary to encode critically sampled coefficients
• Advantages in spatial scalability
• Resolves the drift in SNR scalability
• Adaptive processing for each subband
  - Different ME accuracy, interpolation filter, temporal filter taps, etc.
• Very general framework which can be combined with other existing techniques (intra mode, UMCTF, etc.)

Page 91: Wavelet Video Coding

91

Results

[Plot: Foreman, 300 frames, full-pel ME/MC, 30 fps, CIF. PSNR Y (dB), 30 to 36, versus bitrate (Kbps), 500 to 1500. Curves: IBMCTF with level-by-level CODWT; SDMCTF; IBMCTF with full CODWT.]

Page 92: Wavelet Video Coding

92

Results

[Plot: Foreman, 300 frames, full-pel ME/MC, 30 fps, CIF. PSNR Y (dB), 30 to 36, versus bitrate (Kbps), 500 to 1500. Curves: IBMCTF with level-by-level CODWT; SDMCTF; IBMCTF with full CODWT.]

[Plot: Foreman, 300 frames, full-pel ME/MC, 15 fps, QCIF. PSNR Y (dB), 31 to 39, versus bitrate (Kbps), 150 to 500. Curves: IBMCTF with level-by-level CODWT; SDMCTF; IBMCTF with full CODWT.]

Page 93: Wavelet Video Coding

93

Results

[Plot: Foreman, 300 frames, full-pel ME/MC, 7.5 fps, Q-QCIF. PSNR Y (dB), 36 to 48, versus bitrate (Kbps), 100 to 170. Curves: IBMCTF with level-by-level CODWT; SDMCTF; IBMCTF with full CODWT.]

Page 94: Wavelet Video Coding

94

Generation of Wavelet Blocks

A wavelet block provides a direct association between the wavelet coefficients and what they represent spatially

ME is done based on the wavelet block

No motion vector overhead, because the number of motion vectors to be coded is the same as in SDMCTF

Perfectly aligned with a tree-structured entropy coder

Entropy-based motion estimation criterion!
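The grouping behind a wavelet block can be sketched as index bookkeeping. The sketch below assumes a square coefficient array in the usual nested (Mallat) subband layout and a dyadic decomposition. The function name and the layout are illustrative assumptions, not taken from the slides. Each coefficient of the coarsest low band is the root of one block, which collects the co-located coefficients of every detail band at every level.

```python
def wavelet_block(r, c, size, levels):
    """Positions (row, col) of the wavelet block rooted at LL coefficient (r, c).

    `size` is the side of the square coefficient array and `levels` the number
    of dyadic decomposition levels (nested subband layout assumed).
    """
    positions = [(r, c)]                      # root in the coarsest LL band
    for j in range(levels, 0, -1):            # coarsest level first
        n = size >> j                         # side of each level-j subband
        k = 1 << (levels - j)                 # side of the block's patch at level j
        # Offsets of the three detail bands of level j in the nested layout.
        for row_off, col_off in ((0, n), (n, 0), (n, n)):
            for dr in range(k):
                for dc in range(k):
                    positions.append((row_off + r * k + dr, col_off + c * k + dc))
    return positions

size, levels = 8, 2
ll_side = size >> levels                      # side of the coarsest LL band
blocks = [
    wavelet_block(r, c, size, levels)
    for r in range(ll_side)
    for c in range(ll_side)
]
block_sizes = [len(b) for b in blocks]
covered = {pos for b in blocks for pos in b}
```

With `levels = 2`, every block holds 16 coefficients, the same count as the 4x4 pixel region it represents, and the blocks partition the coefficient array. This is why one motion vector per wavelet block gives the same motion-vector count as spatial-domain block matching with equally sized blocks.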

Page 95: Wavelet Video Coding

95

Proposed interleaving of overcomplete wavelet coefficients

[Figure: the coefficients for shift = 0 and the coefficients for shift = 1 are combined into one set of interleaved coefficients.]
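A small sketch of the interleaving idea, assuming a one-level 1-D Haar high band with circular boundaries (the helper names are illustrative). Alternating the coefficients of the two phases gives the undecimated filter output. An integer displacement of the frame then becomes the same integer displacement of the interleaved band, so an ordinary block-matching search can operate on it.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_high(x, shift):
    """Haar high band of x after a circular shift by `shift` samples."""
    n = len(x)
    return [
        (x[(2 * k + shift) % n] - x[(2 * k + 1 + shift) % n]) / SQRT2
        for k in range(n // 2)
    ]

def interleave(phase0, phase1):
    """Alternate the coefficients of the two phases: p0[0], p1[0], p0[1], ..."""
    out = []
    for c0, c1 in zip(phase0, phase1):
        out.extend((c0, c1))
    return out

def undecimated_high(x):
    """Haar high-pass filter output without downsampling."""
    n = len(x)
    return [(x[i] - x[(i + 1) % n]) / SQRT2 for i in range(n)]

def shift_left(x, d):
    """Circular shift: y[n] = x[(n + d) mod N]."""
    n = len(x)
    return [x[(i + d) % n] for i in range(n)]

def interleaved_band(x):
    return interleave(haar_high(x, 0), haar_high(x, 1))

frame = [3.0, 7.0, 1.0, 1.0, 8.0, 2.0, 5.0, 9.0]
band = interleaved_band(frame)

same_as_undecimated = band == undecimated_high(frame)
shift_invariant = all(
    interleaved_band(shift_left(frame, d)) == shift_left(band, d)
    for d in range(len(frame))
)
```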

Page 96: Wavelet Video Coding

96

Overcomplete Wavelet Transform with Interleaving

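[Figure: the original frame and its overcomplete wavelet subbands after interleaving, shown for the first-level subbands (LL, HL, LH, HH) and for the second-level subbands obtained from the LL band.]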

Page 97: Wavelet Video Coding

97

Advantages of Interleaving

Interleaving algorithm enables optimal sub-pixel accuracy motion estimation and compensation in IBMCTF

By interleaving, any existing ME module (HVSBM, FSBM, Intra Mode, etc.) with any fractional-pel accuracy can be used

Can be easily used for the MCTF framework with any fractional-pel accuracy using the lifting structure

Page 98: Wavelet Video Coding

98

3-D Lifting Structure for IBMCTF

Direct extension of SD-MCTF lifting to IBMCTF:

$$H_j^i[m,n] = \frac{1}{\sqrt{2}}\left( B_j^i[m,n] - \tilde{A}_j^i[m - d_m,\; n - d_n] \right), \quad i = 0,\ldots,3$$

Interpolation operation for frame $A_j^i$ is not optimal (no cross-phase dependencies incorporated)
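The lifting form of motion-compensated temporal filtering can be sketched in one dimension. The sketch assumes a Haar-like filter pair with a 1/sqrt(2) predict step and a sqrt(2) update step, one global integer displacement `d`, and circular boundaries. None of these choices is prescribed by the slides, and no interpolation or low-band shift is modeled. It shows that the predict/update pair stays perfectly invertible whatever the displacement.

```python
import math

SQRT2 = math.sqrt(2.0)

def mctf_forward(a, b, d):
    """Haar-like lifting of the frame pair (a, b) with integer displacement d.

    Predict: h[m]     = (b[m] - a[m - d]) / sqrt(2)
    Update:  l[m - d] = h[m] + sqrt(2) * a[m - d]
    """
    n = len(a)
    h = [(b[m] - a[(m - d) % n]) / SQRT2 for m in range(n)]
    l = [0.0] * n
    for m in range(n):
        l[(m - d) % n] = h[m] + SQRT2 * a[(m - d) % n]
    return l, h

def mctf_inverse(l, h, d):
    """Undo the update step first, then the predict step."""
    n = len(l)
    a = [0.0] * n
    for m in range(n):
        a[(m - d) % n] = (l[(m - d) % n] - h[m]) / SQRT2
    b = [SQRT2 * h[m] + a[(m - d) % n] for m in range(n)]
    return a, b

d = 2
frame_a = [3.0, 7.0, 1.0, 1.0, 8.0, 2.0, 5.0, 9.0]
# frame_b is frame_a displaced by d samples, so the motion model is exact.
frame_b = [frame_a[(m - d) % len(frame_a)] for m in range(len(frame_a))]

low, high = mctf_forward(frame_a, frame_b, d)
rec_a, rec_b = mctf_inverse(low, high, d)
```

When the displacement matches the true motion, as here, the high-pass frame is all zeros and the whole energy moves into the low-pass frame. With a wrong displacement the transform is still invertible, only less compact.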

Page 99: Wavelet Video Coding

99

3-D Lifting Structure for IBMCTF

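$$H_j^i[m,n] = \frac{1}{\sqrt{2}}\left( B_j^i[m,n] - \mathrm{LBS\_}\tilde{A}_j^i[2^j m - d_m,\; 2^j n - d_n] \right), \quad i = 0,\ldots,3$$

$$L_j^i[m - \bar{d}_m,\; n - \bar{d}_n] = \mathrm{LBS\_}\tilde{H}_j^i[2^j m - \bar{d}_m + d_m,\; 2^j n - \bar{d}_n + d_n] + \sqrt{2}\, A_j^i[m - \bar{d}_m,\; n - \bar{d}_n], \quad i = 0,\ldots,3$$

[Figure: subband layout of frame B for a two-level decomposition, with level-1 subbands $B_1^1$, $B_1^2$, $B_1^3$ and level-2 subbands $B_2^0$, $B_2^1$, $B_2^2$, $B_2^3$.]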

Page 100: Wavelet Video Coding

100

Results

"Foreman"300 frames, 30fps, CIF

[Plot: PSNR (dB), 26 to 42, versus bitrate (bps), 0 to 2,500,000. Curves: SDMCTF (1/8), IBMCTF (1/8), SDMCTF, IBMCTF.]

Page 101: Wavelet Video Coding

101

Overcomplete wavelet coding using standard-compliant DCT base-layers

[Block diagram with the blocks DWT, MPEG-compliant coding/decoding, CODWT, MCTF (ME and temporal filtering), SBC, EC, MVC, and transmission. Labeled signals: video input, low-frequency band, high-frequency bands, residual information, DWT, decoded pictures, MV and reference frame number.]

DWT: Discrete Wavelet Transform

SBC: Sub-Band Coder

EC: Entropy Coder

CODWT: Complete-to-Overcomplete DWT

ME: Motion Estimation

MVC: Motion Vector Coder

Proposed by Andreopoulos, van der Schaar, et al., ICIP 2003

Page 102: Wavelet Video Coding

102

Results

Page 103: Wavelet Video Coding

Current Status in MPEG Standardization

Page 104: Wavelet Video Coding

104

MPEG's Scalable Coding History

Development of scalable video coding solutions has a long history in MPEG, starting from MPEG-2

    Spatial, temporal and SNR scalability with at most 3 levels

    MPEG-4 Fine Granularity Scalability

So far, all standardized solutions have shown a deficiency in coding performance, which is mainly due to the recursive MC structure

    Drift occurs when not all information is available

    Drift-free structures are less coding efficient
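The drift trade-off can be illustrated with a toy closed-loop predictive coder on scalar "frames". This is a sketch under strong assumptions: uniform quantizers, one base and one enhancement layer, and a decoder that receives only the base layer. It is not a model of any MPEG codec, and the names and step sizes are illustrative. If the encoder predicts from the enhanced reconstruction, the base-only decoder loses track of the reference. If it predicts from the base reconstruction, the error stays bounded but the prediction is coarser.

```python
def quantize(value, step):
    """Uniform scalar quantizer with the given step size."""
    return step * round(value / step)

def base_only_errors(frames, base_step, enh_step, predict_from_enhanced):
    """Per-frame absolute error at a decoder that receives only the base layer."""
    enc_ref = 0.0    # reference used in the encoder's prediction loop
    dec_ref = 0.0    # reconstruction at the base-layer-only decoder
    errors = []
    for x in frames:
        residual = x - enc_ref
        base = quantize(residual, base_step)
        enh = quantize(residual - base, enh_step)
        dec_ref += base                       # decoder never sees `enh`
        enc_ref += base + (enh if predict_from_enhanced else 0)
        errors.append(abs(x - dec_ref))
    return errors

frames = [10.3 * t for t in range(1, 21)]
base_step, enh_step = 4, 1

drifting = base_only_errors(frames, base_step, enh_step, predict_from_enhanced=True)
drift_free = base_only_errors(frames, base_step, enh_step, predict_from_enhanced=False)
```

In the drift-free configuration the error never exceeds half the base quantizer step, because encoder and decoder references are identical. In the drifting configuration the discarded enhancement terms accumulate in the decoder's reference, so the error can exceed that bound.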

Page 105: Wavelet Video Coding

105

MPEG's Interframe Wavelet Coding Exploration

New embedded wavelet solutions were proposed in the Digital Cinema Call for Proposals and in the Call for Proposals on improved coding efficiency (both due July 2001)

At the Pattaya meeting (Dec. 2001), MPEG started an Ad hoc Group to explore Interframe Wavelet Coding

Different methods were investigated

    MC prediction with intraframe (2D) wavelet

    In-band MC prediction based on overcomplete 2D wavelet decomposition

    3D (spatio-temporal) wavelet coding based on MCTF

3D (t+2D, 2D+t) appeared most promising, providing excellent coding efficiency while being fully scalable in temporal, spatial and quality resolution

Experimental software was used

Page 106: Wavelet Video Coding

106

MPEG's Interframe Wavelet Coding Exploration

The Interframe Wavelet exploration was successfully completed in October 2002

9 Call for Evidence on Scalable Coding Advances - July 2003

24 Call for Proposal Responses - March 2004

Page 107: Wavelet Video Coding

107

Some less favorable results (out of 10 sequences)

SNR Results from MPEG's Interframe Wavelet Coding Exploration

[Two plots comparing AVC 1, AVC 2, and MCTF. Annotated PSNR gaps: 3.5 dB, 1.45 dB, 1.75 dB, 0.75 dB.]

Page 108: Wavelet Video Coding

108

Some better results (out of 10 sequences)

SNR Results from MPEG's Interframe Wavelet Coding Exploration

[Two plots comparing AVC 1, AVC 2, and MCTF. Annotated PSNR gaps: 3.6 dB, 0.01 dB, 1.42 dB, 0.4 dB.]

Page 109: Wavelet Video Coding

109

Acknowledgements

Jens Ohm

Yiannis Andreopoulos

Jong Ye

Konstantin Hanke

Claudia Mayer

Thomas Rusert