STreaming Day 2006
An Improved Context Adaptive Binary Arithmetic
Coder for the H.264/AVC standard
Simone Milani
Dept. of Information Engineering – University of Padova, Italy
E-mail: [email protected]
STreaming Day, Pisa, Sept. 11, 2006
Outline
Introduction to arithmetic coding
The H.264/AVC standard
Arithmetic coding of coefficients in the H.264/AVC standard
Modeling the probabilities through a Directed Acyclic Graph (DAG)
A Belief-Propagation based arithmetic coder
Experimental results
Video coding
Video compression algorithms allow the transmission or the storage of video sequences. The winning scheme is the one that keeps the distortion of the reconstructed sequence as low as possible while significantly reducing the number of coded bits.
The most recent compression standards resort to adaptive arithmetic coding as an efficient solution to reduce the size of the coded bit stream.
During the last decades, different video coding algorithms have been designed to achieve higher and higher compression performance for different applications.
Applications: video storage, Web TV, video surveillance, video communication.
The H.264/AVC video coding standard
The hybrid video coder H.264/AVC adopts two different entropy coding algorithms.
[Block diagram of the hybrid H.264/AVC encoder: the input slice, split into 16x16 macroblocks, is coded through intra-frame prediction or motion estimation/compensation; the prediction residual goes through transform, scaling and quantization, with an inverse scaling/transform loop and a de-blocking filter for reconstruction. The quantized transform coefficients, the motion data and the control data are entropy coded into the output bitstream.]
CABAC: Context Adaptive Binary Arithmetic Coding
CAVLC: Context Adaptive Variable Length Coding
Arithmetic coding
[Figure: successive subdivision of the [0, 1) interval while the bits of the string 01101 are coded; the width of each sub-interval is proportional to the probability of the corresponding symbol.]
In arithmetic coding, each binary string is mapped into a sub-interval of [0, 1) according to the probabilities of the symbols that were coded.
Ex. the string [01101] with P(0) = [1/7, 1/4, 3/8, 1/8, 1/4] at the successive coding steps.
In the adaptive approach, the probabilities of the binary symbols (i.e., the lengths of the intervals) are updated according to the statistics of the input signal. Each binary symbol is associated with a context that identifies a distinct probability distribution.
[Diagram: binary symbol → context identification → context probability → arithmetic coder, with the coded symbol fed back to update the context probability.]
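The interval mapping above can be sketched in a few lines. This is a minimal illustration of the slide's example, not a full coder (it keeps exact fractions and omits renormalization and bit output); the function name `code_interval` is ours.

```python
from fractions import Fraction

def code_interval(bits, p0_list):
    """Map a bit string to a sub-interval of [0, 1).

    p0_list[i] is P(0) used when coding bits[i]; the sub-interval
    reserved for '0' is the lower part of the current interval.
    """
    low, width = Fraction(0), Fraction(1)
    for bit, p0 in zip(bits, p0_list):
        if bit == 0:
            width = p0 * width            # keep the lower sub-interval
        else:
            low += p0 * width             # skip the part reserved for '0'
            width = (1 - p0) * width
    return low, width

# The slide's example: string [0 1 1 0 1] with P(0) = 1/7, 1/4, 3/8, 1/8, 1/4
probs = [Fraction(1, 7), Fraction(1, 4), Fraction(3, 8),
         Fraction(1, 8), Fraction(1, 4)]
low, width = code_interval([0, 1, 1, 0, 1], probs)
```

Any number inside the final interval identifies the coded string; the interval width equals the product of the coded symbols' probabilities.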
Context-Adaptive Binary Arithmetic Coding
The binarizer maps each syntax element into a variable-length binary string.
Binarization allows a more efficient design of the contexts.
The binarizer also performs a partial “entropy coding” when the contexts have just been initialized and their probability distributions do not yet match the statistics of the input data.
The context modeler associates each symbol with a context that identifies a binary probability mass function.
CABAC for H.264/AVC DCT coefficients (1/2)
Each block is converted into a sequence of quantized transform coefficients through a zig-zag scan. The zero coefficients and the signs are then coded separately from the coefficients' absolute values.
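The zig-zag scan can be generated by walking the anti-diagonals of the block, alternating direction on each diagonal. A small sketch (helper names `zigzag_order`/`scan` are ours); for a 4x4 block this reproduces the usual frame-mode scan order.

```python
def zigzag_order(n=4):
    """Zig-zag scan order for an n x n block: anti-diagonals of
    increasing sum r + c, alternating traversal direction per diagonal."""
    coords = [(r, c) for r in range(n) for c in range(n)]
    return sorted(coords,
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def scan(block):
    """Flatten a 2-D block of quantized coefficients along the zig-zag path."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```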
For the zeros and the signs, couples or triplets (n, l, s (optional)) of binary values are sent to the arithmetic coder, where
n: 1 if coeff. != 0
l: 1 if all the following coeffs. are 0
s: 1 if coeff. > 0 (sent only for non-zero coefficients)
The context depends on the scanning position.
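One plausible reading of the (n, l, s) scheme is sketched below; the helper name `significance_symbols` is ours, and context selection by scanning position is omitted.

```python
def significance_symbols(coeffs):
    """Produce the (n, l, s) binary values for a zig-zag scanned block.

    n = 1 if the coefficient is nonzero; for nonzero coefficients,
    l = 1 if every following coefficient is zero (last significant one)
    and s = 1 if the coefficient is positive.
    """
    out = []
    for i, c in enumerate(coeffs):
        if c == 0:
            out.append((0,))                      # only n is sent
            continue
        last = all(x == 0 for x in coeffs[i + 1:])
        out.append((1, int(last), int(c > 0)))
        if last:
            break                                 # nothing left to signal
    return out
```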
CABAC for H.264/AVC DCT coefficients (2/2)
The absolute value of each non-zero coefficient is then binarized into a variable-length string, and each bit is sent to the binary arithmetic coder.
A context (ctx0, ctx1, …, ctx6) is associated with each bit according to its position in the string.
Each context identifies a distinct binary p.m.f. (p(0), p(1)).
Note that, unlike the significance map, these contexts do not depend on the scanning position.
In our approach, we skipped the binarization block.
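A simple way to picture the per-position contexts is a plain unary binarization with a clamped context index. This is only an illustrative sketch (the standard actually uses a concatenated unary/Exp-Golomb scheme); the helper name and the clamping at ctx6 are taken from the slide's ctx0…ctx6 labels.

```python
def unary_binarize(level, max_ctx=6):
    """Binarize level >= 1 as a unary string (level-1 ones then a zero)
    and attach a context index to each bit position, clamped at max_ctx."""
    bits = [1] * (level - 1) + [0]
    ctxs = [min(pos, max_ctx) for pos in range(len(bits))]
    return bits, ctxs
```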
The correlation among coefficients
After the transform operation, the coefficients need to be rescaled since the transform is not orthonormal. As a consequence of the approximation of the rescaling factors, the basis functions are not perfectly orthogonal.
Moreover, the adopted transform size has a weaker decorrelating capability than those of previous coding standards.
Therefore, the transform coefficients turn out to be correlated. At the same time, the coefficients of adjacent blocks are correlated.
It is possible to take advantage of this correlation to improve the coding algorithm, i.e., the probability estimation mechanism.
Modeling the bit probability through a DAG
• The correlation among different coefficients is well modeled by a Directed Acyclic Graph (DAG)
• Each variable is associated with a coefficient in a 4x4 structure
• Each 4x4 structure can be associated either with the coefficients of a single transform block or with the coefficients of different blocks at equal frequencies within a macroblock
To simplify the model, we adopted separate graphical models for the different bit planes. This allowed the coder to deal with binary variables (an Ising model).
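The DAG structure over the 4x4 grid can be made concrete as follows. We assume, from the slide on the probability estimate, that each variable's predecessors are its horizontal (left) and vertical (up) neighbours; the helper name `predecessors` is ours.

```python
def predecessors(r, c):
    """Predecessors pi_s of the variable at (r, c) in the 4x4 DAG:
    the horizontal (left) and vertical (up) neighbours, when they exist."""
    preds = []
    if c > 0:
        preds.append((r, c - 1))   # horizontal predecessor A
    if r > 0:
        preds.append((r - 1, c))   # vertical predecessor B
    return preds
```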
The adopted model
The performance of the arithmetic coder was improved by modeling the least significant bit-planes of a block with DAGs.
For the most significant bit-planes, the use of DAGs is not convenient since it increases the overall computational complexity. Since the nonzero bits are few and sparse in the upper bit-planes, the traditional approach works well there.
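Splitting the coefficient magnitudes into bit-planes, as described above, is a one-liner; the helper name `bitplanes` is ours and plane 0 is taken as the least significant.

```python
def bitplanes(levels, n_planes):
    """Split the absolute values of the coefficients into binary planes;
    plane b holds bit b of |level| (b = 0 is the least significant plane)."""
    return [[(abs(v) >> b) & 1 for v in levels] for b in range(n_planes)]
```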
Ising Model
Through the Ising model it is possible to characterize the joint probability of a graph of binary variables.
Let $\pi_s$ be the set of predecessors of $s$ and $\theta_{z|x,y} = P(s = z \mid A = x, B = y)$, with $\pi_s = \{A, B\}$, $z, x, y \in \{0, 1\}$ and $s = a, \dots, p$ (the 16 variables of the 4x4 grid). The joint probability factorizes along the DAG as

$P(a, \dots, p) = P(a)\,P(b \mid a) \cdots P(k \mid g, j)\,P(l \mid h, k)\,P(o \mid k, n)\,P(p \mid l, o) = P(a) \prod_{s \neq a} P(s \mid \pi_s)$

and each conditional term can be written in the exponential (Ising) form

$P(s \mid A, B) = \prod_{z, x, y \in \{0, 1\}} \theta_{z|x,y}^{\,[s=z]\,[A=x]\,[B=y]}$

where the indicator exponents are expressed as products of the binary variables, e.g. $[s=0][A=0][B=0] = (1-s)(1-A)(1-B)$, so that $\theta_{0|0,0}$ appears with exponent $(1-s)(1-A)(1-B)$ and $\theta_{1|1,1}$ with exponent $s\,A\,B$.
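The factorization can be evaluated directly. A minimal sketch, assuming left/up predecessors and (our assumption, not stated on the slide) that a missing predecessor is treated as 0; `theta[(z, x, y)]` stands for $\theta_{z|x,y}$ and the function name `joint_prob` is ours.

```python
def joint_prob(grid, p_root, theta):
    """Joint probability of a 4x4 binary grid under the DAG factorization
    P(a..p) = P(a) * prod_s P(s | A, B), with A the left neighbour and
    B the up neighbour of each variable."""
    p = p_root[grid[0][0]]                    # P(a), the root variable
    for r in range(4):
        for c in range(4):
            if r == 0 and c == 0:
                continue
            A = grid[r][c - 1] if c > 0 else 0    # horizontal predecessor
            B = grid[r - 1][c] if r > 0 else 0    # vertical predecessor
            p *= theta[(grid[r][c], A, B)]
    return p
```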
Coding process: probability estimate
The probability estimate is performed through a message-passing algorithm: the probability of the current bit is obtained by combining the probabilities of its vertical and horizontal predecessors in the DAG.
Note that the zero coefficients are already known from the coded run values.
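One sum-product step of this message passing can be sketched as follows: the belief that the current bit is 1 is obtained by marginalizing $\theta_{1|x,y}$ over the predecessors' beliefs. The helper name `estimate_p1` and the dictionary layout are ours.

```python
def estimate_p1(theta, q_A, q_B):
    """Sum-product estimate of P(s = 1) from the beliefs q_A, q_B of the
    horizontal and vertical predecessors (q[x] = belief that the
    predecessor equals x); theta[(1, x, y)] = P(s = 1 | A = x, B = y)."""
    return sum(theta[(1, x, y)] * q_A[x] * q_B[y]
               for x in (0, 1) for y in (0, 1))
```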
Coding process: state transitions
In the old CABAC FSM, the state transition depends on whether the coded bit corresponds to the most probable symbol or not (the transitions are stored in a look-up table).
In the new CABAC FSM, the state transition depends on the estimated probability. The new state is computed according to the equation

$\text{new\_state} = \left\lfloor \frac{\log p(x_i)}{0.0855} + 0.5 \right\rfloor$
MAP estimate of the probabilities
The adopted model requires the estimate of the conditional probabilities (i.e., the moments of the Ising model).
The moments are computed using a log-MAP estimate.
The estimate can be done either off-line on a set of training sequences or on-line using an adaptive algorithm.

Off-line computation, over $M$ training samples $(s_i, A_i, B_i)$:

$\hat{\theta}_{0|0,0} = \frac{1}{M} \sum_{i=1}^{M} (1-s_i)(1-A_i)(1-B_i)$
$\hat{\theta}_{0|0,1} = \frac{1}{M} \sum_{i=1}^{M} (1-s_i)(1-A_i)\,B_i$
$\hat{\theta}_{0|1,0} = \frac{1}{M} \sum_{i=1}^{M} (1-s_i)\,A_i\,(1-B_i)$
$\hat{\theta}_{0|1,1} = \frac{1}{M} \sum_{i=1}^{M} (1-s_i)\,A_i\,B_i$

On-line computation, a recursive update with forgetting factor $\alpha$ applied for each new sample:

$\hat{\theta}_{0|0,0} \leftarrow \alpha\,\hat{\theta}_{0|0,0} + (1-\alpha)(1-s_i)(1-A_i)(1-B_i)$
$\hat{\theta}_{0|0,1} \leftarrow \alpha\,\hat{\theta}_{0|0,1} + (1-\alpha)(1-s_i)(1-A_i)\,B_i$
$\hat{\theta}_{0|1,0} \leftarrow \alpha\,\hat{\theta}_{0|1,0} + (1-\alpha)(1-s_i)\,A_i\,(1-B_i)$
$\hat{\theta}_{0|1,1} \leftarrow \alpha\,\hat{\theta}_{0|1,1} + (1-\alpha)(1-s_i)\,A_i\,B_i$
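The on-line update amounts to an exponentially weighted moving average of the indicator terms. A minimal sketch (the function name `update_theta` and the dictionary layout are ours; the default $\alpha$ is an arbitrary illustrative value, not taken from the slides):

```python
def update_theta(theta, s, A, B, alpha=0.95):
    """One on-line update of the estimates theta[(0, x, y)], i.e. the
    moments associated with P(s = 0 | A = x, B = y), after observing
    the sample (s, A, B)."""
    for x in (0, 1):
        for y in (0, 1):
            # indicator (1 - s)[A = x][B = y], written as a product
            ind = (1 - s) * int(A == x) * int(B == y)
            theta[(0, x, y)] = alpha * theta[(0, x, y)] + (1 - alpha) * ind
    return theta
```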
Experimental results (1/2)
Results for the sequence “mobile” (QCIF).
Parameter        Value
GOP structure    IPPP
GOP length       15
Frame rate       30 frames/s
RD optimization  off
Experimental results (2/2)
Results for the sequence “mobile” (CIF).
Results for the sequence “salesman” (QCIF).
Conclusions and future work
Two coding algorithms were designed: the first exploits the correlation among the coefficients of a single block, the second the correlation among the coefficients of a whole macroblock (across different blocks).
The reduction of the coded bit-stream size for the “mobile” sequence was about 12%.
Future work
Estimation of the probabilities from a fixed set of models (found through an EM procedure)
Application of the algorithm with different binarizations and context structures
Implementation of the whole procedure in fixed-point arithmetic
Bibliography
[1] J. Yedidia, W. Freeman, and Y. Weiss, “Constructing free-energy approximations and generalized belief propagation algorithms,” IEEE Trans. Inf. Theory, vol. 51, no. 7, pp. 2282–2312, Jul. 2005.
[2] M. I. Jordan and Y. Weiss, “Graphical models: Probabilistic inference,” in The Handbook of Brain Theory and Neural Networks, M. A. Arbib, Ed. MIT Press, 2002.
[3] Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, “Joint Final Committee Draft (JFCD) of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC),” 4th JVT Meeting, Klagenfurt, Germany, Jul. 2002.
[4] D. Marpe, H. Schwarz, and T. Wiegand, “Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 620–636, Jul. 2003.
[5] S. Milani and G. A. Mian, “An Improved Context Adaptive Binary Arithmetic Coder for the H.264/AVC standard,” in Proc. EUSIPCO 2006, Florence, Italy, Sept. 4–8, 2006.