Dali Virtual Machine Wei Tsang Ooi (With Brian, Sugata, Tibor, Steve, Haye, Matthew, Dmitriy)
Dec 19, 2015
Motivations
We know how to encode, transmit, and store multimedia data today
People are starting to look into ways to process multimedia data
Video Editing
Concat, cross-fade, overlay, cut and paste
Gateway
Convert from one format to another
Change compression rate
[Figure: input video (format A) -> transcoder -> output video (format B)]
DBMS
"Find all movies in the database that contain this image"
Set-top Boxes
Subtitling and closed captioning
Mixing of incoming movies
Lecture Browser
Match video to slide
Switch between video streams
Current Solutions
Black box C code (mpeg_play)
  Hard to reuse/adapt/break apart
  Unpredictable performance
Standard library (PPM, IJG)
  Least common denominator (RGB for all)
  Video frames --> RGB --> gray
History
Rivl - high level scripting language for multimedia processing
Still suffers from “black box” problem
Experience
Real cost of processing
  Inherently complex operations
    Projective transform
  Layered operations
    Video decoding
  Simple operations but lots of data
    Copy
Dali Architecture
[Figure: layered architecture. Dali Code (User) -> Dali Compiler -> DVM Code (Expert User) -> Dali Virtual Machine -> implementations in C, MMX, TriMedia]
Today’s Talk
DVM Code
Dali Virtual Machine
DVM Code : Design Goal
Small number of composable abstractions
Simple, predictable opcodes
High performance
Target for compiler
Easy to extend
Abstractions
Byte image
Bitstream
MPEG
JPEG
Bit masks
PCM
Byte Image
Two-dimensional array of bytes
Can be either physical or virtual
[Figure: a physical image and a virtual image]
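The physical/virtual distinction can be sketched in Python. This is a minimal illustration, assuming a virtual image is a rectangular window that shares pixel storage with a physical image (which is how `byte_clip` followed by `byte_set` behaves in the fractal example). The `ByteImage` class and its method names are hypothetical and are not the Dali API.

```python
class ByteImage:
    """2D array of bytes. A virtual image is a window onto a parent's storage."""

    def __init__(self, width, height, _buf=None, _stride=None, _offset=0):
        self.width = width
        self.height = height
        # A physical image owns its buffer; a virtual one borrows the parent's.
        self.buf = bytearray(width * height) if _buf is None else _buf
        self.stride = width if _stride is None else _stride
        self.offset = _offset
        self.is_virtual = _buf is not None

    def clip(self, x, y, w, h):
        """Return a virtual image covering the w x h rectangle at (x, y)."""
        if x < 0 or y < 0 or x + w > self.width or y + h > self.height:
            raise ValueError("clip rectangle outside image")
        return ByteImage(w, h, self.buf, self.stride,
                         self.offset + y * self.stride + x)

    def get(self, x, y):
        return self.buf[self.offset + y * self.stride + x]

    def set_all(self, value):
        """Set every pixel in this image (or window) to value."""
        for row in range(self.height):
            start = self.offset + row * self.stride
            self.buf[start:start + self.width] = bytes([value]) * self.width


image = ByteImage(8, 4)          # physical: allocates 32 bytes
image.set_all(255)
window = image.clip(2, 1, 3, 2)  # virtual: no pixel data is copied
window.set_all(0)                # writes through to the physical image
```

Because the window only stores an offset and a stride, clipping costs no pixel copies, and writes through the window are visible in the physical image.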
Byte Image Can Represent
Gray scale image
RGB image
YUV image
Alpha channel
Example Code : Fractal
  smaller = byte_new(neww, newh);
  byte_shrink_2x2(image, smaller);
[Figure: image is shrunk into smaller, which is neww wide and newh high]
  target = byte_clip(image, 0, 0, dx, 2*dy);
  byte_set(target, 0);
  target = byte_clip(image, 3*dx, 0, dx, 2*dy);
  byte_set(target, 0);
[Figure: target is a strip of image, dx wide and 2dy high]
  target = byte_clip(image, dx, 0, neww, newh);
  byte_copy(smaller, target);
[Figure: smaller (neww by newh) is copied into target, the region of image that starts dx from the left edge]
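  target = byte_clip(image, 2*dx, 2*dy, neww, newh);
  byte_copy(smaller, target);
  target = byte_clip(image, 0, 2*dy, neww, newh);
  byte_copy(smaller, target);
[Figure: smaller (neww by newh) is copied into two regions of image that start 2dy from the top, one at the left edge and one 2dx from the left edge]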
DVM Code
  dx = 0.25 * byte_width(image);
  dy = 0.25 * byte_height(image);
  neww = 0.5 * byte_width(image);
  newh = 0.5 * byte_height(image);
  smaller = byte_new(neww, newh);
  for (i = 0; i < n; i++) {
      byte_shrink_2x2(image, smaller);
      target = byte_clip(image, 0, 0, dx, 2*dy);
      byte_set(target, 0);
      target = byte_clip(image, 3*dx, 0, dx, 2*dy);
      byte_set(target, 0);
      target = byte_clip(image, dx, 0, neww, newh);
      byte_copy(smaller, target);
      target = byte_clip(image, 2*dx, 2*dy, neww, newh);
      byte_copy(smaller, target);
      target = byte_clip(image, 0, 2*dy, neww, newh);
      byte_copy(smaller, target);
  }
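The fractal loop can be mirrored in runnable Python to see what each step does to the pixels. This is a sketch under stated assumptions, not the Dali implementation: the function names follow the slides, but their bodies are guesses. In particular, `byte_shrink_2x2` is assumed to average each 2x2 block, and `byte_clip` is assumed to return a window that shares storage with its source.

```python
class Img:
    """Byte image: a w x h window onto a (possibly shared) bytearray."""

    def __init__(self, buf, stride, offset, w, h):
        self.buf, self.stride, self.offset = buf, stride, offset
        self.w, self.h = w, h

    def row(self, y):
        """Return the (start, end) buffer indices of row y."""
        start = self.offset + y * self.stride
        return start, start + self.w


def byte_new(w, h):
    return Img(bytearray(w * h), w, 0, w, h)           # explicit allocation


def byte_clip(img, x, y, w, h):
    return Img(img.buf, img.stride, img.offset + y * img.stride + x, w, h)


def byte_set(img, value):
    for y in range(img.h):
        a, b = img.row(y)
        img.buf[a:b] = bytes([value]) * img.w


def byte_copy(src, dst):
    assert (src.w, src.h) == (dst.w, dst.h), "sizes must match"
    for y in range(src.h):
        sa, sb = src.row(y)
        da, db = dst.row(y)
        dst.buf[da:db] = src.buf[sa:sb]


def byte_shrink_2x2(src, dst):
    # Assumption: each output pixel is the mean of one 2x2 input block.
    for y in range(dst.h):
        for x in range(dst.w):
            total = 0
            for yy in (0, 1):
                for xx in (0, 1):
                    total += src.buf[src.offset + (2 * y + yy) * src.stride
                                     + 2 * x + xx]
            dst.buf[dst.offset + y * dst.stride + x] = total // 4


def fractal(image, n):
    dx, dy = image.w // 4, image.h // 4
    neww, newh = image.w // 2, image.h // 2
    smaller = byte_new(neww, newh)                     # allocated once, reused
    for _ in range(n):
        byte_shrink_2x2(image, smaller)
        byte_set(byte_clip(image, 0, 0, dx, 2 * dy), 0)
        byte_set(byte_clip(image, 3 * dx, 0, dx, 2 * dy), 0)
        byte_copy(smaller, byte_clip(image, dx, 0, neww, newh))
        byte_copy(smaller, byte_clip(image, 2 * dx, 2 * dy, neww, newh))
        byte_copy(smaller, byte_clip(image, 0, 2 * dy, neww, newh))


image = byte_new(8, 8)
byte_set(image, 200)
fractal(image, 1)
```

After one iteration on a uniform 8x8 image, the top-left and top-right strips are zero and the three copies of the shrunken image fill the rest.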
General Dali Strategies
Specific instructions: byte_shrink_2x2, byte_shrink_4x4, etc.
Explicit memory allocation: byte_new
Reduce data: byte_clip
Composable abstraction: byte image
Performance
            Dali VM    Rivl
Sparc 20    0.5 s      8.8 s
P2 266      0.3 s      4.0 s
Run fractal with n = 4 on an 800x600 gray scale image
About 21 frames/sec for 320x240 video
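The numbers in the table can be turned into speedups, and they suggest one plausible reading of the 21 frames/sec figure. The slide does not show that derivation, so the linear scaling with pixel count used here is an assumption.

```python
# Timings from the table above: seconds to run fractal (n = 4) on 800x600.
timings = {"Sparc 20": (0.5, 8.8), "P2 266": (0.3, 4.0)}   # (Dali VM, Rivl)

# How many times faster the Dali VM is than Rivl on each machine.
speedup = {machine: rivl / dali for machine, (dali, rivl) in timings.items()}

# Assumption: processing cost scales linearly with the number of pixels.
pixel_ratio = (800 * 600) / (320 * 240)                    # 6.25
seconds_per_frame = timings["P2 266"][0] / pixel_ratio     # about 0.048 s
frames_per_second = 1 / seconds_per_frame                  # about 20.8
```

Under that assumption, the P2 266 timing scales to roughly 20.8 frames/sec at 320x240, which is consistent with the "about 21" on the slide.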
MPEG
Getting important and pervasive
Complex format
Complex code (mpeg_play)
Most decoders provide a GetNextFrame() interface
Random access, direct transfer difficult
Abstraction for MPEG Video
[Figure: a sequence is laid out as seqhdr, then (gophdr, gop) pairs, ..., then seqend; each gop is in turn laid out as (pichdr, pic) pairs, ...]
DVM Code Examples
Skip to the n-th frame
  for (i = 0; i < n; i++) {
      mpeg_pic_hdr_find(bs);
      mpeg_pic_hdr_skip(bs);
  }
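A rough Python analogue of the find/skip loop, working on raw bytes. It relies on the MPEG-1/2 video picture start code (bytes 00 00 01 00). The `Bitstream` class and the function bodies are hypothetical stand-ins: here "skip" only steps past the start code, whereas the opcode on the slide presumably skips the whole picture header.

```python
PICTURE_START_CODE = b"\x00\x00\x01\x00"   # MPEG-1/2 video picture start code


class Bitstream:
    """A chunk of memory plus a read position (hypothetical stand-in)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0


def pic_hdr_find(bs):
    """Move to the next picture start code; return False if none remains."""
    i = bs.data.find(PICTURE_START_CODE, bs.pos)
    if i < 0:
        bs.pos = len(bs.data)
        return False
    bs.pos = i
    return True


def pic_hdr_skip(bs):
    """Step past the start code so the next find sees the following picture."""
    bs.pos += len(PICTURE_START_CODE)


def skip_pictures(bs, n):
    """Find and skip n picture headers, as in the loop on the slide."""
    for _ in range(n):
        if not pic_hdr_find(bs):
            return False
        pic_hdr_skip(bs)
    return True


# Toy stream: sequence header, GOP header, then three "pictures".
data = (b"\x00\x00\x01\xb3SEQ" + b"\x00\x00\x01\xb8GOP"
        + PICTURE_START_CODE + b"AAAA"
        + PICTURE_START_CODE + b"BBBB"
        + PICTURE_START_CODE + b"CCCC")
bs = Bitstream(data)
found = skip_pictures(bs, 2)
```

Nothing is decoded along the way: reaching a given frame only costs a scan for start codes.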
Concat 2 Video Sequences
[Figure: inbs1 and inbs2 each hold (hdr, gop) pairs; outbs receives the pairs from inbs1 followed by the pairs from inbs2]
DVM Code
Concat two sequences
  while (!end_of(inbs1)) {
      gop_hdr_dump(inbs1, outbs);
      gop_dump(inbs1, outbs);
  }
  while (!end_of(inbs2)) {
      gop_hdr_dump(inbs2, outbs);
      gop_dump(inbs2, outbs);
  }
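The same idea can be sketched on raw bytes in Python: copy whole GOPs without decoding them. The start-code values are the standard MPEG-1/2 video ones (GOP start 00 00 01 B8, sequence end 00 00 01 B7). Everything else is a simplifying assumption: both inputs are taken to share the same sequence parameters, the first input's sequence header is reused, and the time codes inside the GOP headers are left untouched.

```python
GOP_START = b"\x00\x00\x01\xb8"   # MPEG-1/2 video group start code
SEQ_END = b"\x00\x00\x01\xb7"     # MPEG-1/2 video sequence end code


def gops(stream):
    """Yield each GOP (GOP header plus its pictures) as a bytes slice."""
    end = stream.find(SEQ_END)
    if end < 0:
        end = len(stream)
    start = stream.find(GOP_START)
    while 0 <= start < end:
        nxt = stream.find(GOP_START, start + len(GOP_START))
        if nxt < 0 or nxt > end:
            nxt = end
        yield stream[start:nxt]
        start = nxt if nxt < end else -1


def concat(in1, in2):
    """Concatenate two sequences GOP by GOP, without decoding anything."""
    first_gop = in1.find(GOP_START)
    if first_gop < 0:
        raise ValueError("first stream contains no GOP")
    out = bytearray(in1[:first_gop])      # keep the first sequence header
    for gop in gops(in1):
        out += gop
    for gop in gops(in2):
        out += gop
    out += SEQ_END                         # a single sequence end code
    return bytes(out)


SEQ_HDR = b"\x00\x00\x01\xb3"
s1 = SEQ_HDR + b"H1" + GOP_START + b"a" + GOP_START + b"b" + SEQ_END
s2 = SEQ_HDR + b"H2" + GOP_START + b"c" + SEQ_END
joined = concat(s1, s2)
```

The cost is proportional to the number of bytes copied, not to the cost of decoding and re-encoding the video.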
Performance Analysis
Decode MPEG sequences to PPM
No output
Run on 3 different sequences
Performance
Rate (frames/sec)

Sequence            mpeg_play (Sparc 20)    Dali VM (Sparc 20)    Dali VM (P2 266)
psycho (160x120)    27.3                    27.9                  112.1
bus (320x240)       8.8                     9.9                   36.0
tennis (320x240)    7.8                     9.1                   32.6
MPEG Strategies
Expose structure: GOP header and GOP instead of just frames
Break up layers: parse, decode, color space conversions
Show Time !
Decode an MPEG sequence
For each frame, scale it to "stamp" size
Create a contact sheet
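The scale-and-tile part of the demo can be sketched in Python. Decoding is left out: the frames are assumed to be already-decoded grayscale images stored as lists of rows, and the `shrink` and `contact_sheet` helpers are hypothetical names, not Dali opcodes.

```python
def shrink(frame, factor):
    """Downscale a grayscale frame by averaging factor x factor blocks."""
    h, w = len(frame) // factor, len(frame[0]) // factor
    return [[sum(frame[y * factor + j][x * factor + i]
                 for j in range(factor) for i in range(factor))
             // (factor * factor)
             for x in range(w)]
            for y in range(h)]


def contact_sheet(frames, factor, columns):
    """Shrink every frame to stamp size and tile the stamps row by row."""
    stamps = [shrink(frame, factor) for frame in frames]
    sh, sw = len(stamps[0]), len(stamps[0][0])
    rows = -(-len(stamps) // columns)            # ceiling division
    sheet = [[0] * (sw * columns) for _ in range(sh * rows)]
    for k, stamp in enumerate(stamps):
        top, left = (k // columns) * sh, (k % columns) * sw
        for y in range(sh):
            sheet[top + y][left:left + sw] = stamp[y]
    return sheet


# Three uniform 4x4 "frames" with gray levels 10, 20 and 30.
frames = [[[level] * 4 for _ in range(4)] for level in (10, 20, 30)]
sheet = contact_sheet(frames, factor=2, columns=2)
```

Unused cells of the sheet stay at zero, so the third stamp sits in the lower-left corner with a blank cell beside it.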
What Is a Bitstream?
Skip to the n-th frame
  for (i = 0; i < n; i++) {
      mpeg_pic_hdr_find(bs);
      mpeg_pic_hdr_skip(bs);
  }
Bitstream
[Figure: a parser reading bits (1011001...) from a chunk of memory]
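Modes of Operation
File I/O: fread/fwrite moves data between the file and static memory; the parser reads the static memory
Network: a socket fills static memory; the parser reads the static memory
Memory map: the parser reads the entire file as one chunk of memory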
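Two of these modes can be sketched in Python, where the same parser function is handed either a fixed buffer that the application refills explicitly, or a memory map of the entire file. The `parse` function is a toy stand-in for a real parser and all names are hypothetical. A network mode would look like the file I/O mode with the buffer filled from a socket instead.

```python
import mmap
import os
import tempfile


def parse(chunk):
    """Toy stand-in for a parser: count bytes equal to 0x01 in one chunk."""
    return bytes(chunk).count(b"\x01")


def parse_with_file_io(path, buffer_size=4096):
    """File I/O mode: the application explicitly refills a fixed buffer."""
    buf = bytearray(buffer_size)            # the static chunk of memory
    view = memoryview(buf)
    total = 0
    with open(path, "rb") as f:
        while True:
            n = f.readinto(buf)             # explicit I/O, like fread
            if n == 0:
                break
            total += parse(view[:n])        # parser only ever sees memory
    return total


def parse_with_mmap(path):
    """Memory-map mode: the parser sees the entire file as one chunk."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return parse(mapped)


data = b"\x00\x01\x02\x01" * 3000           # 12000 bytes, 6000 of them 0x01
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
path = tmp.name
counts = (parse_with_file_io(path), parse_with_mmap(path))
os.remove(path)
```

In both modes the parser never performs I/O itself, so the application decides when and how much data is read.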
[Figure: inbs holds an MPEG system stream of interleaved video and audio packets; a filter selects the video packets into videobs, which the parser then reads]
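A simplified Python sketch of such a filter. It assumes the input consists only of packets of the form start code prefix (00 00 01), one stream id byte, a 16-bit big-endian length, then the payload, with video stream ids in the range 0xE0 to 0xEF as in MPEG system streams. Real system streams also carry pack headers, system headers, and extra per-packet header fields that this sketch ignores; `packets`, `filter_video`, and `make_packet` are hypothetical helpers.

```python
import struct

PREFIX = b"\x00\x00\x01"


def packets(stream):
    """Yield (stream_id, payload) for each packet in the simplified stream."""
    pos = 0
    while pos + 6 <= len(stream):
        if stream[pos:pos + 3] != PREFIX:
            raise ValueError("lost packet sync at byte %d" % pos)
        stream_id = stream[pos + 3]
        (length,) = struct.unpack_from(">H", stream, pos + 4)
        yield stream_id, stream[pos + 6:pos + 6 + length]
        pos += 6 + length


def filter_video(inbs):
    """Collect the payloads of video packets into a new bitstream."""
    videobs = bytearray()
    for stream_id, payload in packets(inbs):
        if 0xE0 <= stream_id <= 0xEF:       # video stream ids
            videobs += payload
    return bytes(videobs)


def make_packet(stream_id, payload):
    """Build one packet in the simplified layout (for the demo only)."""
    return PREFIX + bytes([stream_id]) + struct.pack(">H", len(payload)) + payload


inbs = (make_packet(0xE0, b"V1") + make_packet(0xC0, b"A1")
        + make_packet(0xE0, b"V2") + make_packet(0xC0, b"A2"))
videobs = filter_video(inbs)
```

The video parser can then run on `videobs` exactly as it would on a plain video bitstream.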
Bitstream Strategies
Predictable performance: explicit I/O
Recap: Current Solutions
Black box C code (mpeg_play)
  Hard to reuse/adapt/break apart
  Unpredictable performance
Recap: Cost of Processing
Inherently complex operations
Layered operations
Simple operations but lots of data
How We Solve the Problems
Difficult to reuse/adapt/break apart
  Minimum set of abstractions
  Abstractions are simple
  Simple reusable opcodes
Unpredictable performance
  Explicit I/O
  Explicit memory allocations
Inherently complex operations
  Special case
  Clipping and masking
Layered operations
  Expose structure
Simple operations but lots of data
  Special case
  Clipping and masking
Work in Progress
Extending with new formats: JPEG, WAV, AVI, etc.
TriMedia chip implementation
Application: lecture browser
Future Work
Multithreading
MMX implementation
Compiler