Recursive End-to-end Distortion Estimation with
Model-based Cross-correlation Approximation
Hua Yang, Kenneth Rose
Signal Compression Lab
University of California, Santa Barbara
9/17/2003 ICIP 2003 2
Outline
Introduction
Problems in applying ROPE to sub-pixel prediction
Model-based cross-correlation approximation
Simulation results
Conclusions
Introduction Video transmission over lossy networks
Transmission chain: Raw Video -> Source Encoder -> Channel Encoder -> Network -> Channel Decoder -> Source Decoder -> Displayed Video
Loss in the network is due to errors, buffer overflow, and long delay.
End-to-end quality = f(source coding, channel loss, error concealment)
Introduction Rate-distortion (RD) optimization
An efficient framework for error robustness.
R: Coding bit rate. Accurate measurement is trivial.
D: End-to-end distortion. Accurate measurement is non-trivial.
Recursive optimal per-pixel estimate (ROPE) [Zhang 2000]
Accounts for all the relevant factors: source coding, channel loss, and error concealment.
Superior performance among existing schemes: high estimation accuracy and low complexity.
Frequently applied to R-D optimized mode selection in several video coding frameworks.
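A minimal sketch of R-D optimized mode selection, assuming the usual Lagrangian cost J = D + lambda * R (the slides do not spell out the cost function). The function name `select_mode`, the multiplier `lam`, and the per-macroblock numbers are hypothetical and serve only to illustrate why an accurate estimate of D matters.

```python
from typing import Dict, Tuple


def select_mode(candidates: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Return the coding mode with the smallest cost J = D + lam * R.

    candidates maps a mode name to (rate_in_bits, expected_distortion).
    The expected distortion is where an estimator such as ROPE would plug in.
    """
    return min(candidates, key=lambda m: candidates[m][1] + lam * candidates[m][0])


# Hypothetical numbers for one macroblock, for illustration only.
modes = {"intra": (400.0, 30.0), "inter": (120.0, 55.0)}
print(select_mode(modes, lam=0.05))  # small lam weights distortion more
print(select_mode(modes, lam=0.5))   # large lam weights rate more
```

With these made-up numbers the choice flips between the two modes as `lam` grows, so an error in the distortion term can change the selected mode.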
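Introduction Problems in applying ROPE
ROPE does not accommodate sub-pixel prediction.
More generally, it does not accommodate techniques that average pixels:
Sub-pixel prediction
Bi-directional prediction for B and EP frames
De-blocking filter
Overlapped block motion compensation, etc.
Pixel averaging introduces inter-pixel cross-correlation terms, and computing them exactly within ROPE incurs prohibitive storage and computational cost.
Goal: handle the cross-correlation terms with
• Low complexity
• Accurate estimation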
Introduction One existing solution [Stuhlmuller, 2000]
If d(X, Y) < dmax, where d(X, Y) is the distance between pixels X and Y, accurately estimate and store E{XY};
otherwise, assume E{XY} = E{X}E{Y}.
Motivation: two distant pixels are less likely to be averaged in practice.
Advantage: greatly reduces the complexity.
Drawbacks:
A substantial number of cross-correlation values must still be computed and stored in advance.
The assumption that distant pixels are uncorrelated compromises the estimation accuracy.
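The distance-threshold rule can be sketched as a lookup with a fallback. This is an illustration only: the slide does not define d(X, Y), so Chebyshev distance on pixel coordinates is assumed here, and the names `thresholded_cross_corr`, `mean`, and `stored` are hypothetical.

```python
from typing import Dict, Tuple

Pixel = Tuple[int, int]  # (row, col)


def thresholded_cross_corr(
    x: Pixel,
    y: Pixel,
    mean: Dict[Pixel, float],
    stored: Dict[Tuple[Pixel, Pixel], float],
    d_max: int,
) -> float:
    """Return E{XY}: a stored value for nearby pixels, E{X}E{Y} otherwise."""
    # Assumption: Chebyshev distance; the slide leaves d(X, Y) unspecified.
    d = max(abs(x[0] - y[0]), abs(x[1] - y[1]))
    if d < d_max:
        # Nearby pair: the value must have been computed and stored in advance.
        key = (min(x, y), max(x, y))
        return stored[key]
    # Distant pair: treated as uncorrelated.
    return mean[x] * mean[y]
```

The `stored` table is the part that still costs memory and up-front computation, which is the drawback noted above.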
Introduction Our proposed solution in this work
Two cross-correlation approximation schemes stemming from two differing model assumptions.
Based on the marginal moments of pixels, which are available quantities in ROPE.
No additional storage space, and no redundant computation for possibly unused cross-correlation values.
The high estimation accuracy of ROPE is well maintained.
Problems in Applying ROPE to Sub-pixel Prediction
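Recursive optimal per-pixel estimate (ROPE)
For an Inter-mode macroblock (MB): pixel i in frame n is predicted by pixel j in frame n-1. To conceal a lost frame, simply replace it with the previous reconstructed frame.
With $p$ the packet loss rate, $\hat{e}_n^i$ the quantized prediction residue, and $\tilde{f}_n^i$ the decoder reconstruction of pixel i in frame n:
$$E\{\tilde{f}_n^i\} = (1-p)\,\big(\hat{e}_n^i + E\{\tilde{f}_{n-1}^j\}\big) + p\,E\{\tilde{f}_{n-1}^i\}$$
$$E\{(\tilde{f}_n^i)^2\} = (1-p)\,E\{(\hat{e}_n^i + \tilde{f}_{n-1}^j)^2\} + p\,E\{(\tilde{f}_{n-1}^i)^2\}$$
Sub-pixel prediction
Interpolation from the original pixels at integer positions.
Improves the performance of motion-compensated prediction.
H.263: half-pixel; H.264: quarter-pixel or even higher accuracy.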
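A minimal sketch of the Inter-mode recursion for one pixel with integer-pixel prediction, where only marginal moments are needed. The function and argument names are hypothetical. The distortion expression is the standard expansion of E{(f - f~)^2}, which is not written out on the slide and is included here as an assumption about how the moments are used.

```python
from typing import Tuple


def rope_inter_update(
    p: float,          # packet loss rate
    e_hat: float,      # quantized prediction residue of pixel i in frame n
    m1_ref_j: float,   # E{f~} of reference pixel j in frame n-1
    m2_ref_j: float,   # E{f~^2} of reference pixel j in frame n-1
    m1_prev_i: float,  # E{f~} of co-located pixel i in frame n-1
    m2_prev_i: float,  # E{f~^2} of co-located pixel i in frame n-1
) -> Tuple[float, float]:
    """Return (E{f~}, E{f~^2}) of pixel i in frame n."""
    # Packet received: f~ = e_hat + f~_{n-1}^j, with e_hat deterministic.
    m1_recv = e_hat + m1_ref_j
    m2_recv = e_hat * e_hat + 2.0 * e_hat * m1_ref_j + m2_ref_j
    # Packet lost: concealment copies pixel i of the previous frame.
    m1 = (1.0 - p) * m1_recv + p * m1_prev_i
    m2 = (1.0 - p) * m2_recv + p * m2_prev_i
    return m1, m2


def expected_distortion(f_orig: float, m1: float, m2: float) -> float:
    """E{(f - f~)^2} = f^2 - 2 f E{f~} + E{f~^2} for original pixel value f."""
    return f_orig * f_orig - 2.0 * f_orig * m1 + m2
```

Both outputs depend only on per-pixel first and second moments, which is why integer-pixel ROPE needs no cross-correlation storage.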
Problems in Applying ROPE to Sub-pixel Prediction
Half-pixel prediction in H.263
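Assume pixel i in frame n is predicted by a half pixel in frame n-1, e.g. b.
Integer pixel positions: A, B, C, D. Half pixel positions: a, b, c, d.
Interpolation, where CTRL is a control parameter equal to 0 or 1:
$$a = A$$
$$b = (A + B + 1 - \mathrm{CTRL})/2$$
$$c = (A + C + 1 - \mathrm{CTRL})/2$$
$$d = (A + B + C + D + 2 - \mathrm{CTRL})/4$$
Then the first and second moments of b are:
$$E\{\tilde{b}\} = \big[E\{\tilde{A}\} + E\{\tilde{B}\} + 1 - \mathrm{CTRL}\big]/2$$
$$E\{\tilde{b}^2\} = \big[(1-\mathrm{CTRL})^2 + 2(1-\mathrm{CTRL})\big(E\{\tilde{A}\} + E\{\tilde{B}\}\big) + E\{\tilde{A}^2\} + E\{\tilde{B}^2\} + 2E\{\tilde{A}\tilde{B}\}\big]/4$$
The term $E\{\tilde{A}\tilde{B}\}$ is the inter-pixel cross-correlation.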
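The half-pixel moments can be sketched as below, ignoring integer rounding as the slide's expressions do. The function and argument names are hypothetical. The point is that the second moment cannot be computed from marginal moments alone: it needs the cross-correlation `c_ab`.

```python
from typing import Tuple


def half_pel_moments(
    m_a: float,   # E{A~}
    m_b: float,   # E{B~}
    s_a: float,   # E{A~^2}
    s_b: float,   # E{B~^2}
    c_ab: float,  # E{A~ B~}, the inter-pixel cross-correlation
    ctrl: int = 0,
) -> Tuple[float, float]:
    """Return (E{b~}, E{b~^2}) for b = (A + B + 1 - CTRL) / 2."""
    c = 1 - ctrl
    m1 = (m_a + m_b + c) / 2.0
    m2 = (c * c + 2.0 * c * (m_a + m_b) + s_a + s_b + 2.0 * c_ab) / 4.0
    return m1, m2


# Sanity check with deterministic pixels A = 10, B = 20 and CTRL = 1:
# b = 15, so the moments should be 15 and 225.
print(half_pel_moments(10.0, 20.0, 100.0, 400.0, 200.0, ctrl=1))
```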
Problems in Applying ROPE to Sub-pixel Prediction
Inter-pixel cross-correlation
Its presence is due to the pixel averaging operation, which appears in many common techniques of video coding standards.
Exact computation of the cross-correlation terms needed for the current frame may require all the cross-correlation terms of previous frames.
This entails too much complexity for practical video coding systems. For example, at 4 bytes per value and QCIF resolution, about 2.4 GB is needed to store the cross-correlation terms for accurate distortion estimation.
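The storage figure can be checked with a short calculation, assuming one value is stored for every ordered pair of pixels within a QCIF luminance frame (176 x 144). That counting convention is an assumption; the slide gives only the total.

```python
# QCIF luminance resolution.
width, height = 176, 144
pixels = width * height          # number of pixels in one frame

# Assumption: one 4-byte value for every ordered pair of pixels.
pairs = pixels * pixels
bytes_needed = 4 * pairs

gib = bytes_needed / 2**30
print(f"{pixels} pixels, {bytes_needed} bytes, about {gib:.2f} GiB")
```

Under this assumption the total comes to roughly 2.4 GiB, consistent with the figure quoted above.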
Model-based Cross-correlation Approximation
Basic idea
Approximate the cross-correlation between two pixels by a function of the available 1st and 2nd order marginal moments.
Consequently, there are no additional storage requirements and minimal additional computational complexity, as the computation occurs only when a specific cross-correlation is needed.
Problem formulation
Approximate $E\{XY\}$, given $E\{X\}$, $E\{Y\}$, $E\{X^2\}$, $E\{Y^2\}$.
Model-based Cross-correlation Approximation
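Model-based cross-correlation approximation
X and Y are two pixels.
Model I
X = a + bY, where a, b are unknown constants, b > 0.
$$E\{XY\} = E\{X\}E\{Y\} + \sigma_X \sigma_Y, \quad \text{with } \sigma_X = \sqrt{E\{X^2\} - E\{X\}^2},\ \ \sigma_Y = \sqrt{E\{Y^2\} - E\{Y\}^2}$$
Model II
X = N + bY, where b is a constant and N is a zero-mean random variable independent of Y.
$$E\{XY\} = \frac{E\{X\}}{E\{Y\}}\,E\{Y^2\}$$
Model II is asymmetric: one must specify which of the two pixels is X and which is Y.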
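A minimal sketch of the approximations as functions of marginal moments, following directly from the two model assumptions (Model I: cov(X, Y) = sigma_X * sigma_Y for b > 0; Model II: b = E{X}/E{Y} since N is zero-mean and independent of Y). The function names are hypothetical, and the handling of E{Y} = 0 and of slightly negative variances is an added safeguard, not something stated on the slide.

```python
import math


def corr_model0(m_x: float, m_y: float) -> float:
    """Uncorrelated baseline: E{XY} = E{X} E{Y}."""
    return m_x * m_y


def corr_model1(m_x: float, m_y: float, s_x: float, s_y: float) -> float:
    """Model I (X = a + bY, b > 0): E{XY} = E{X}E{Y} + sigma_X * sigma_Y.

    s_x and s_y are the second moments E{X^2} and E{Y^2}.
    """
    # Clamp at zero in case accumulated estimation error makes a variance negative.
    var_x = max(s_x - m_x * m_x, 0.0)
    var_y = max(s_y - m_y * m_y, 0.0)
    return m_x * m_y + math.sqrt(var_x * var_y)


def corr_model2(m_x: float, m_y: float, s_y: float) -> float:
    """Model II (X = N + bY): E{XY} = (E{X} / E{Y}) * E{Y^2}."""
    if m_y == 0.0:
        # Assumption: for non-negative pixel values, E{Y} = 0 implies Y = 0.
        return 0.0
    return (m_x / m_y) * s_y
```

Swapping the roles of X and Y leaves `corr_model1` unchanged but generally changes `corr_model2`, which is the asymmetry noted above.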
Model-based Cross-correlation Approximation
Useful bounds
Used to further limit the propagation of estimation error.
General bound for each involved quantity: a pixel value lies within the range 0 to 255.
Bound for cross-correlation (Schwarz inequality):
$$|E\{XY\}| \le \sqrt{E\{X^2\}\,E\{Y^2\}}$$
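One way the bounds might be applied is to clamp an approximated cross-correlation before it is used. This is a sketch under assumptions: the function name `clamp_cross_corr` is hypothetical, and the lower bound of 0 follows from pixel values being non-negative rather than from anything stated on the slide.

```python
import math


def clamp_cross_corr(c_xy: float, s_x: float, s_y: float, max_val: float = 255.0) -> float:
    """Clamp an estimate of E{XY} to its admissible range.

    s_x and s_y are the second moments E{X^2} and E{Y^2}.
    """
    # Schwarz inequality: E{XY} <= sqrt(E{X^2} E{Y^2}).
    schwarz = math.sqrt(max(s_x, 0.0) * max(s_y, 0.0))
    # Pixel range: 0 <= X, Y <= 255, hence 0 <= E{XY} <= 255^2.
    upper = min(schwarz, max_val * max_val)
    return min(max(c_xy, 0.0), upper)
```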
Simulation Results
Simulation settings: UBC H.263+ codec
Encoder: total bit rate and packet loss rate are given. Half-pixel prediction is employed.
Decoder: PSNR is averaged over 50 packet loss patterns generated under the same packet loss rate.
Simulation Results Tested methods
“Model I”, “Model II”: the two proposed approximation schemes.
“Model 0”: uncorrelated model, E{XY} = E{X}E{Y}.
“Full Pel”: the method in the original work of ROPE, which approximates half-pixel prediction simply by integer pixel prediction.
“Actual”: real average PSNR result at the decoder.
“Performance Bound”: ROPE with integer pixel prediction, which demonstrates the best estimation accuracy of ROPE.
Simulation Results
Distortion estimation performance
[Figure: PSNR (dB) versus Frame No. Curves: Actual, Model I, Model II, Model 0, Full Pel.]
Foreman, QCIF, 30f/s, 200kb/s, 1st 150 frames, p = 5%, periodic Intra-update.
“Model II” has the best end-to-end distortion estimation accuracy.
Simulation Results
Distortion estimation performance comparison (cont.)
[Figure: PSNR Difference (dB) versus Packet Loss Rate (%). Curves: Model I, Model II, Model 0, Performance Bound.]
Foreman, QCIF, 30f/s, 200kb/s, 1st 150 frames, periodic Intra-update.
In spite of the simplicity of the linear model, “Model II” approaches the performance bound of ROPE very closely.
Both proposed methods achieve better estimation accuracy than that of the “Model 0” method.
Simulation Results
Performance improvement comparison
Performance gain of half-pixel prediction over integer pixel prediction in the case of RD optimized INTRA updating
[Figure, two panels: PSNR Gain (dB) versus Packet Loss Rate (%). Curves in each panel: Model I, Model II, Model 0, Full Pel.]
Foreman, QCIF, 30f/s, 200kb/s, 1st 150 frames, RD optimized Intra-update.
Miss_am, QCIF, 30f/s, 100kb/s, 1st 150 frames, RD optimized Intra-update.
Both proposed approximation schemes consistently achieve better performance gains than the other two methods.
Conclusions
Two model-based schemes to approximate the cross-correlation with the available quantities from ROPE.
Low complexity & High estimation accuracy
The practical applicability of ROPE is significantly enhanced.