Recent Developments in Video Compression Standards and their Impact on Embedded Platforms: from Scalable to Multi-view Video Coding
Iole Moccagatta, PhD
Multimedia Group, IMEC, Kapeldreef 75, B-3001, Leuven, [email protected]
ESTIMedia 2006, October 26th, 2006
EstiMedia 2006 - © imec 2006
Outline
• The MPEG-4 video standards
• H.264/MPEG-4 AVC
• Scalability in H.264 Ann. G/MPEG-4 SVC
• MPEG-4 MVC: context, motivation, and coding principles
• FVV system and application scenarios
• H.264/MPEG-4 AVC complexity
• H.264 Ann. G/MPEG-4 SVC and MVC complexity
• Impact on embedded platforms
• Conclusions
MPEG, VCEG, and the Joint Video Team (JVT)
[Figure: standards bodies and their video standards]
• ISO/IEC SC29/WG11 (MPEG): MPEG-4, MPEG-7, MPEG-21
• ITU-T SG16/Q.6 (VCEG): H.261, H.262/MPEG-2, H.263, H.263+, ...
• JVT: H.264/AVC, SVC, MVC
The H.264/MPEG-4 Video Standards
• MPEG-4 Part 2
  – IS: 2004 (status: 3rd edition)
• ITU-T Rec. H.264/MPEG-4 Part 10 Advanced Video Coding (AVC)
  – IS: 2005 (status: 3rd edition)
• AVC Amd. 1: Support of colour spaces
  – FDIS: October 2006 (status: FPDAM)
• AVC Amd. 2: Advanced 4:4:4 profiles
  – FDIS: January 2007 (status: PDAM)
• ITU-T Rec. H.264 Ann. G/AVC Amd. 3: Scalable Video Coding (SVC)
  – FDIS: January 2007 (status: PDAM)
• AVC Amd. 4: Multi-view Video Coding (MVC)
  – FDIS: January 2008 (status: WD)
H.264/MPEG-4 AVC Codec Structure
H.264/MPEG-4 AVC: Motion Compensated Prediction
Additional Features of Mot. Comp.
H.264/MPEG-4 AVC: Multiple Reference Frames
• Multiple picture buffer
  – FIFO or sliding window
  – adaptive memory control
  – 16 pictures max (memory is constrained)
• per-8x8 reference control
• Bi-predicted picture: 2 sets of motion vectors per block
New Types of Temporal Referencing
• Known dependencies (MPEG-1, MPEG-2, etc.)
• New dependencies
  – referencing order and display order are decoupled
    • IBBPBBP.. vs. IBBPBBBBPBP...
  – referencing type and picture type are decoupled
    • B frames can be used as reference
There is more ....
• More coding tools
  – Motion vector prediction using motion vectors from 4 neighboring blocks
  – Adaptive weighted prediction (generalized B slices)
    • each prediction sample can be weighted
    • an offset can be added
  – Interlaced coding
    • field or frame coding
    • macroblock adaptive frame/field coding
  – Context-Adaptive Binary Arithmetic Coding (CABAC)
• More error resilience and network adaptation tools
  – Parameter set structure
  – Network Abstraction Layer (NAL) syntax structure
  – Arbitrary Slice Ordering (ASO)
  – Data Partitioning (DP)
  – Redundant Slices
  – SP/SI synchronization/switching slices
• More sideband information
  – Supplemental Enhancement Information (pan-scan, cropping, etc.)
  – Video Usability Information (aspect ratio of luma sample, overscan, etc.)
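The weighted prediction item above (a weight per prediction sample plus an offset) can be illustrated with a minimal sketch. It assumes 8-bit samples and a single reference with a fixed-point weight; the function name, parameter names, and default values are illustrative and are not taken from the standard text or from any reference software.

```python
def weighted_pred(samples, weight, offset, log_wd=5, bit_depth=8):
    """Apply a fixed-point weight and an offset to prediction samples.

    Hypothetical helper: 'weight' is a gain with 'log_wd' fractional bits
    (weight == 1 << log_wd means unity gain); 'offset' is added after scaling.
    """
    max_val = (1 << bit_depth) - 1
    rounding = 1 << (log_wd - 1) if log_wd >= 1 else 0
    out = []
    for p in samples:
        v = ((p * weight + rounding) >> log_wd) + offset
        out.append(min(max(v, 0), max_val))  # clip to the valid sample range
    return out


# Unity gain leaves samples unchanged; half gain plus offset models a fade.
print(weighted_pred([100, 200], weight=32, offset=0))
print(weighted_pred([100, 200], weight=16, offset=10))
```

A weight below unity with a positive offset is the kind of setting an encoder might choose for a fade, where plain motion compensation predicts poorly.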
H.264/MPEG-4 AVC Coding Efficiency Performance
[Sullivan, SPIE, Aug. 2004]
Fig. 7: (a) – (e) Comparison of R-D curves for MPEG-2 (MP2), MPEG-4 Part 2 ASP (MP4 ASP) and H.264/AVC (MP4 AVC). I frames were inserted every 15 frames (N=15) and two non-reference B frames per reference I or P frame were used (M=3)
Dramatic Increase of Heterogeneous Devices
Solution: from Simulcast to Scalable Video Coding
Requirement: (Extended) Spatial scalability
Requirement: Temporal Scalability
Requirement: Quality Scalability
History and Current Status
[Figure: SVC standardization timeline]
• Meetings on the timeline (number, location, date):
  – 66, Brisbane, Oct '03
  – 71, Hong Kong, Jan '05
  – 76, Montreux, Apr '06
  – 79, Marrakech, Jan '07
• Milestones and labels on the timeline: CfP [N5958], WD, FDIS, Wavelet Exploration Group, N8043, MPEG-21 Part 13, MPEG-4 Part 10 Amd 3
• Temporal scalability: hierarchical B-frames
• Spatial scalability: layered approach, ESS
• Quality scalability: layered approach for CGS, MGS, bitplane coding for FGS
SVC Block Diagram
[Figure: SVC encoder block diagram with three spatial layers. The input passes through two spatial decimation (2) stages; each layer has MC & intra prediction (texture, motion), base layer coding, and progressive SNR refinement texture coding. The layers are linked by inter-layer prediction techniques and combined by a mux. Labels: H.264/AVC compatible bitstream; generates N CGS layers; generates M (max=3) FGS layers.]
Temporal Scalability in MPEG-4 SVC: Hierarchical B-frames
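[Figure: hierarchical B-frame structure along the time axis. A group of pictures (GOP) is delimited by key pictures (the first one an IDR picture); pictures are assigned to temporal layers T0, T1, T2 and T3.]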
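The temporal layers on this slide can be sketched in a few lines. The example assumes a dyadic hierarchy with a GOP size of 8 (which gives the four layers T0 to T3) and indexes pictures in display order; the function names are hypothetical.

```python
def temporal_layer(poc, gop_size=8):
    """Return the temporal layer of a picture in a dyadic B-frame hierarchy.

    'poc' is the display-order index. Key pictures (multiples of gop_size)
    are in layer 0. Assumes gop_size is a power of two.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    levels = gop_size.bit_length() - 1  # log2(gop_size)
    pos = poc % gop_size
    if pos == 0:
        return 0
    # Trailing zero bits of the position: more divisible means a lower layer.
    trailing = (pos & -pos).bit_length() - 1
    return levels - trailing


def frames_for_layer(num_frames, max_layer, gop_size=8):
    """Display-order indices that remain when only layers 0..max_layer are decoded."""
    return [i for i in range(num_frames)
            if temporal_layer(i, gop_size) <= max_layer]


print([temporal_layer(i) for i in range(9)])
print(frames_for_layer(9, max_layer=2))
```

Dropping the highest layer halves the frame rate each time, which is how a single bitstream can serve, for example, 30 Hz, 15 Hz and 7.5 Hz receivers.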
Spatial Scalability in MPEG-4 SVC
[Figure: two-layer spatially scalable encoder. The input is spatially decimated for the lower layer; each layer has MC & intra prediction (texture, motion) and entropy coding; the layers are linked by inter-layer prediction techniques and combined by a multiplexer.]
Inter-layer Prediction Techniques
• Layered approach
• Three techniques:
  – Inter-layer Intra Texture Prediction
    • un-constrained (multiple-loop decoding at the target layer, i.e. multiple motion compensation loops) and constrained (single-loop decoding)
  – Inter-layer Motion Prediction
    • macroblock partitioning, scaled motion vectors and reference indices of the base layer are used in the enhancement layer (base layer mode)
    • for each motion vector, a quarter-sample motion vector refinement is additionally transmitted and added to the derived motion vector (1/4 pel refinement mode)
  – Inter-layer Residual Prediction
    • only the difference between the current layer's residual and the (up-sampled) residual of the previous layer is coded
• Some concepts already existed in MPEG-2/4 for spatial scalability
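Inter-layer motion prediction and inter-layer residual prediction, as listed above, can be sketched as follows. This is a simplified illustration assuming dyadic (2x) spatial scalability and motion vectors stored in quarter-sample units. Nearest-neighbour upsampling stands in for the real interpolation filter, and all function names are hypothetical.

```python
def predict_enh_mv(base_mv, refinement=(0, 0), scale=2):
    """Derive an enhancement-layer motion vector (quarter-sample units).

    The base-layer vector is scaled by the spatial ratio, then the
    transmitted quarter-sample refinement is added.
    """
    return (base_mv[0] * scale + refinement[0],
            base_mv[1] * scale + refinement[1])


def upsample_nearest(block, scale=2):
    """Nearest-neighbour upsampling (a placeholder for the actual filter)."""
    out = []
    for row in block:
        wide = [v for v in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def residual_to_code(enh_residual, base_residual, scale=2):
    """Difference between the enhancement residual and the upsampled base residual."""
    up = upsample_nearest(base_residual, scale)
    return [[e - u for e, u in zip(e_row, u_row)]
            for e_row, u_row in zip(enh_residual, up)]


print(predict_enh_mv((3, -2), refinement=(1, 0)))
print(residual_to_code([[5, 5, 1, 1], [5, 5, 1, 1]], [[4, 2]]))
```

When the two layers are well correlated, the values left to code are small, which is where the bit-rate saving of residual prediction comes from.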
Quality Scalability in MPEG-4 SVC: Coarse Grain Scalability
[Figure: two-layer CGS encoder. Each layer has MC & intra prediction (texture, motion) and entropy coding; the layers are linked by inter-layer prediction techniques and combined by a multiplexer. The layers use quantization parameters QP1 and QP2, with QP2 < QP1.]
QP = Quantization Parameter
Bringing it Together: Combined Scalability
[Figure: 3D scalability space with example operating points CIF@30Hz, CIF@15Hz and QCIF@30Hz; no strict notion of layer.]
[Schwarz, ICME'05]
Embedded Scalability: Coding Efficiency Cost
[Schwarz, ICME'05]
The difference between the two points is the cost of embedded scalability!
Note: the original target (from MPEG Req.) was a 10% coding efficiency loss in exchange for embedded scalability
MPEG-4 Multi-view Video Coding - Context and Motivations
[Figure: evolution from 2D color TV to 3D color TV (2D to 3D), and from passive single-view video to interactive multi-view video (passive to interactive, 1-view to N-view). Courtesy of Philips and The Matrix.]
Interactive Multi-view Realization: Free View-point Video (FVV)
Courtesy of HHI and Microsoft Research
FVV System and MVC Codec
• Basic components of an example FVV system: Video Capture, Correction, MVC Encoder, MVC Decoder, View Generation, Display
• Example architecture of a FVV decoder: MVC Decoder and View Generation around a Shared Memory, with Video Resource Management Info, MVC Video Elementary Stream Info, Timing Info and Camera Parameters Info
Application Scenarios
• Entertainment, e.g. concert, sport, movie, game...
• Education, e.g. instruction video, cultural archives
• Medical surgery
• Viewing with exploration, e.g. museum, shopping
• Surveillance
• Immersive video conference
• Advertisements
• Event broadcasting
Synchronized Multi-view Video Streams
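[Figure: grid of synchronized streams with views S0, S1, S2, ..., SN on one axis and time instants T0, T1, T2, ..., Tn-1, Tn, Tn+1, Tn+2 on the other. Labels: multi-view image, multi-view video.]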
Promising Coding Tools Currently Considered for MVC Standard
• Hierarchical B pictures for temporal dependencies and an adapted prediction scheme (HHI proposal)
• MVC encoder optimization
• Block level illumination compensation (5 competing proposals)
  – Imperfectly calibrated cameras
  – Different perspective projection direction
  – Different reflection effects
• View synthesis prediction (2 different proposals)
[Figure labels: temporal prediction and view prediction on time/view axes; view interpolation; view warping.]
Hierarchical B Pictures for MVC: Coding Structure (GOP size 8)
[Figure: hierarchical B picture coding structure across views and time, showing temporal prediction, inter-view prediction and combined prediction.]
[Smolić, JVT-T100, July 2006]
Complexity Issues for H.264/AVC BP Decoder (1/2)
• Intra prediction
  – dependency of 4x4 prediction on previously reconstructed 4x4 blocks breaks the macroblock-based pipeline (scheduling issues)
  – intra prediction requires pixel-level access
  – heavy computational complexity (interpolation)
• Inter prediction
  – small blocks (2x2 chroma) increase memory bandwidth (small bursts do not allow hiding the penalty for switching lines)
  – small random fetches waste memory bandwidth in wide memory organizations
  – high computational complexity and increased memory bandwidth (4x4 -> 9x9) of the 6-tap filter for ½ pel
• De-block filtering
  – complex state machine
  – pixel-dependent computation
  – independent luma and chroma computation requires a fast engine
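The "4x4 -> 9x9" figure in the inter prediction item follows from the filter length: a 6-tap filter needs 5 extra reference samples in each dimension. A minimal sketch of that arithmetic, with hypothetical function names and assuming separable filtering in both directions:

```python
def ref_fetch_size(block_w, block_h, taps=6):
    """Reference samples needed to interpolate a block with a separable FIR filter.

    A 'taps'-tap filter needs taps - 1 extra samples in each dimension.
    """
    return block_w + taps - 1, block_h + taps - 1


def fetch_overhead(block_w, block_h, taps=6):
    """Ratio of fetched reference samples to predicted samples."""
    fetch_w, fetch_h = ref_fetch_size(block_w, block_h, taps)
    return (fetch_w * fetch_h) / (block_w * block_h)


for size in (4, 8, 16):
    print(size, ref_fetch_size(size, size), fetch_overhead(size, size))
```

The overhead ratio grows quickly as blocks shrink, which is why small partitions dominate the memory bandwidth of motion compensation.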
Complexity Issues for H.264/AVC BP Decoder (2/2)
• Too many book-keeping/low bandwidth operations
  – variable geometry in neighboring blocks
  – reference picture re-ordering
  – etc.
• ASO and FMO
  – implement de-blocking as second pass (double speed, scheduling issues)
  – linked list of buffers during bit stream parsing (delay)
Complexity Issues for H.264/AVC BP Encoder
• Intra prediction
  – selecting the best 4x4 mode requires reconstructing previous blocks
• Inter prediction
  – selecting the best geometry and motion vector precision requires testing (3+4*4)*3 cases
  – shortcuts are required (ex: select the best mode based on 1 or ½ pel, then compute ¼ pel on the winner, reduce the search range, etc.)
  – complexity increases linearly with the number of reference frames
• Coding gain vs. complexity tradeoff: adding more tools increases complexity while coding efficiency saturates
H.264/AVC BP Decoder: Complexity Estimation from Profiling
• Assumption: complexity measured as time complexity (i.e. number of operations required to execute a specific implementation of an algorithm)
• Profiling performed on 600-MHz P3 PC
• Fair comparison is claimed
  – Comparable SW optimization in both decoders
  – Similar motion estimation and mode decision processes for both encoders
* I- and P-frames
* UVLC (~CAVLC)
* five ref. frames
[Horowitz, IEEE CSVT, July 2003]
H.264/AVC BP Decoder: Memory Requirement from Theoretical Evaluation
• Memory requirements (bytes)
  – Frame buffers (ex: reconstructed and ref. frames, etc.)
  – Buffers storing one line of MBs (ex: intra prediction, etc.)
  – Buffers @ MB level (ex: transform coeff., etc.)
  – Constant data (ex: tables, etc.)
• Note: formulas are derived from algorithmic analysis
• Notation: w = pict. width, h = pict. height, n = # ref. frames
[Horowitz, IEEE CSVT, July 2003]
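The formulas from the cited analysis are not reproduced in the text above. As a rough illustration of the dominant term only (the frame buffers), here is a sketch under stated assumptions: 4:2:0 sampling, 8-bit samples, and one buffer for the frame being reconstructed plus n reference frames. It is not the formula from [Horowitz, IEEE CSVT, July 2003], and the function name is hypothetical.

```python
def frame_buffer_bytes(w, h, n, bytes_per_sample=1):
    """Bytes for one reconstructed frame plus n reference frames (4:2:0 assumed).

    w, h: picture width and height in luma samples; n: number of reference frames.
    """
    samples_per_frame = w * h * 3 // 2  # luma plane + two quarter-size chroma planes
    return (n + 1) * samples_per_frame * bytes_per_sample


# CIF (352x288) with five reference frames, as in the profiling setup above.
print(frame_buffer_bytes(352, 288, 5))
```

The term scales linearly with both picture area and the number of reference frames, so it quickly outweighs the line and macroblock-level buffers.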
H.264/AVC BP Encoder: Complexity Estimation from Profiling
• Profiled on a reduced instruction set computing (RISC) platform (one PE, 1GHz Ultra Sparc II CPU and 8 Gbytes RAM)
• Disclaimer: non-optimized SW, no algorithm optimization (ex: integer-pel full-search ME)
  – Note: complexity analysis based on MPEG ref. SW (JM) typically overstates the actual complexity of the H.264/AVC encoder by an order of magnitude, and that of the decoder by a factor of 2 to 3 [Shafer, EBU Tech Review, Jan '03]
• Requirements
  – memory transfer req. = 460 GB/s
  – computational req. = 300 GIPS
[Chien, IEEE Comm. Mag., Aug. 05]
Complexity Issues for MPEG-4 SVC (1/2)
• Inherited motion compensation memory access complexity from H.264/AVC
  – multiple ref. frames
  – hierarchical B-frames
• Inter-layer prediction
  – un-constrained inter-layer intra texture prediction: multiple motion compensation (multiple loop, one mot. comp./layer)
    • motion compensation memory access complexity * (# layers), scaled by spatial decimation factor across layers
  – inter-layer motion prediction: ok
  – inter-layer residual prediction: memory access complexity to fetch lower layer's residual
[Figure: two two-layer encoder block diagrams, one with spatial decimation and one without; each layer has MC & intra prediction and entropy coding, linked by inter-layer prediction techniques and combined by a multiplexer. Labels: temporal scalability; spatial and quality (CGS) scalability.]
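The "complexity * (# layers), scaled by spatial decimation factor" item can be made concrete with a small estimate. This sketch assumes one motion compensation loop per layer and that memory-access cost is proportional to the number of samples in a layer; costs are relative to the top layer, and the function name is hypothetical.

```python
def multiloop_mc_cost(num_layers, decimation=2, top_layer_cost=1.0):
    """Relative MC memory-access cost of multi-loop decoding.

    Each lower layer has 1/decimation**2 the samples of the layer above,
    so its cost is assumed to shrink by the same factor.
    Returns (total, per-layer list from top layer down).
    """
    per_layer = [top_layer_cost / (decimation ** (2 * k))
                 for k in range(num_layers)]
    return sum(per_layer), per_layer


# Three dyadic spatial layers vs. three quality (CGS) layers at one resolution.
print(multiloop_mc_cost(3, decimation=2))
print(multiloop_mc_cost(3, decimation=1))
```

Under these assumptions, spatial layers add a bounded overhead, whereas quality layers at the same resolution multiply the cost by the number of layers.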
Complexity Issues for MPEG-4 SVC (2/2)
• Efforts to reduce complexity are under way
  – multi-loop vs. single-loop decoding (# of MC loops): multi-loop decoding was recently removed from the standard to reduce decoder complexity
  – Motion Compensated Temporal Filtering (MCTF) = hierarchical B-frames + update step
    • the update step was removed at the decoder side, which removed the requirement to support MCTF at the decoder side
    • may still be used as pre-processing and/or enhancement tool at the encoder side
Complexity issues for MPEG-4 MVC
• Inter-viewpoint prediction + inter-viewpoint & temporal prediction
  – introduced to enhance compression of the simultaneous and multiple video streams by exploiting inter-viewpoint interpolation
  – adds to "classical" temporal interpolation
  – stresses memory bandwidth requirements: think H.264/AVC B-frames, but coming from neighboring views as well
    • memory hierarchy is very important!
• Intermediate view synthesis
  – required by FVV systems that use MPEG-4 MVC as key coding technology
  – goes against the traditional encoder vs. decoder complexity distribution
  – new technology for consumer market apps., where cost is a key factor for success
    • new solutions are needed!
• Simultaneous decoding of multiple frames to enable "Matrix-like bullet time" visual effects
• Efforts to reduce complexity are under way
  – algorithm complexity reduction
    • ex: simplified prediction structures to reduce the number of reference candidates
  – speed-up approaches
    • ex: speed-up of block-level illumination compensation
Impact on Embedded Platforms (1/3)
• Memory access complexity (i.e. memory bandwidth)
  – sources of the increased complexity:
    • multiple ref. frames
    • hierarchical B-frames
    • inter-layer prediction
    • inter-viewpoint prediction
  – impact of memory access on energy: computation doesn't cost much, but bandwidth feasts on energy [Wilson, EDN, Sept. 06]
• How to address memory bandwidth requirements
  – reduce data transfer between processing components, and reduce storage requirements
    • ex1: optimal memory organization/hierarchy
    • ex2: memory organization that increases locality to maximize cache ($) performance
  – Data Transfer and Storage Exploration (DTSE) approaches can help to investigate the space of possible solutions and find the best trade-offs
Impact on Embedded Platforms (2/3)
• Throughput requirement from high spatial resolution
  – Operation frequency of a generic algorithmic component, assuming an efficiency/processing capability for this component of 1 clk./sample
  – 1 clk./sample = 1.5 clk./pixel
  – 1.5 clk./pixel * #MB/sec, ex: 1.14 MHz @ CIF
• Computational power (OPS) requirement from algorithmic complexity
  – Ex: 300 GIPS from H.264/AVC BP Encoder at CIF 30fps
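The throughput arithmetic above can be sketched as a small function. It assumes 4:2:0 sampling (1.5 samples per pixel), 16x16 macroblocks, and 1 clock per sample as on the slide. The frame rate is left as a parameter because the slide's example does not state one, and the function name is hypothetical.

```python
def required_clock_hz(width, height, fps, clk_per_sample=1.0,
                      samples_per_pixel=1.5):
    """Clock rate for a component that processes every sample of every frame.

    Pictures are counted in whole 16x16 macroblocks.
    """
    mbs_per_frame = ((width + 15) // 16) * ((height + 15) // 16)
    mbs_per_sec = mbs_per_frame * fps
    pixels_per_mb = 16 * 16
    return clk_per_sample * samples_per_pixel * pixels_per_mb * mbs_per_sec


print(required_clock_hz(176, 144, 30) / 1e6, "MHz (QCIF, 30 fps)")
print(required_clock_hz(352, 288, 30) / 1e6, "MHz (CIF, 30 fps)")
```

Under these assumptions QCIF at 30 fps needs about 1.14 MHz and CIF at 30 fps about 4.56 MHz. Real components typically need many clocks per sample, so the figure is a lower bound per pipeline stage rather than a full decoder budget.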
Impact on Embedded Platforms (3/3)
• How to address joint throughput and computational power requirements:
  – cannot satisfy these requirements with Task Level Parallelism (TLP, i.e. functional split of the pipeline) alone
    • Data Level Parallelism (DLP, such as inner loop-level parallelism) and/or Instruction Level Parallelism (ILP, i.e. VLIW) is a must have
      – DLP ex: multimedia instruction set extensions (e.g., Intel's MMX and SSE)
    • H/W acceleration is a maybe
      – ex1: Application Specific Instruction Set Processor (ASIP)
      – ex2: H/W for CABAC, CA-VLC, etc.
  – multi-core architectures to beat energy/power consumption
    • load balancing is key issue (see clock islands in H/W)
      – need clever partition
        » make full use of all the resources
        » switch off what is not needed when it is not needed
      – design vs. run time approach vs. combined approach
        » design time: TL and DL parallelism
        » run time: RTOS with multi-threading
"Von Neumann is a poor use of scaling - all the energy is going on the communication between the processor and the memory. It's much better to use 20 microprocessors running at 100MHz than one at 2GHz" [Hugo De Man]
Conclusions
• H.264/MPEG-4 Part 10 AVC
• SVC extension of H.264/MPEG-4 AVC (FDIS Jan '07)
• MVC extension of H.264/MPEG-4 AVC (FDIS Jan '08)
• Memory access complexity significantly increased due to frame/layer/view-point prediction
  – need to minimize data transfer between processing components as well as storage requirements
• Computational complexity increased due to combination of high spatial resolution and algorithmic complexity of codec's tools
  – DLP and ILP are a must have
  – H/W acceleration may be necessary
  – multi-core architecture to reduce power