Page 1: Final presentation winter 2014

Final presentation spring 2015

Students:

ELIRAN COHEN

MICHAEL RAPOPORT

Supervisor: INA RIVKIN

Video manipulation algorithm on ZYNQ

Part B

Page 2: Final presentation winter 2014

Motivation

Our goal is to build an embedded system which can receive video, process it in hardware and software, and finally send it out to a monitor. The system is based on the ZYNQ device from Xilinx.

embedded system

Page 3: Final presentation winter 2014

Project goal

ZedBoard

FMC module

• Add the ability to receive and send out video signals to and from the embedded system.

• Add the ability to process the video signals by hardware, software and both.

HDMI IN

HDMI OUT

Page 4: Final presentation winter 2014

Background

The ZYNQ component

The board we are working on is called the ZedBoard. The main device on our board is the ZYNQ. The ZYNQ consists of two main parts: the FPGA (programmable logic) and the dual-core ARM processor.

We consider the above to be an embedded system.

Page 5: Final presentation winter 2014

The HDMI Input/Output FMC Module

The FMC module enables us to receive and send the HDMI video data. It connects to an FMC carrier on the ZedBoard and provides the following interfaces:

1) HDMI Input
2) HDMI Output
3) The interface for the ZedBoard


Page 6: Final presentation winter 2014

Work environment

Hardware design:
• PlanAhead 14.4 – for Xilinx FPGA design
• Xilinx Platform Studio (XPS)

Software design:
• Xilinx Software Development Kit (SDK)

Debugging:
• SDK

Page 7: Final presentation winter 2014

The previous project – Part A

We built the embedded system, based on the ZYNQ device and the FMC module.

The system was able to receive and send video signals.

Input video signals passed through the FPGA components and through the ARM processor.

Video processing was done by both hardware and software. Video manipulation was simple and done by the software.

Page 8: Final presentation winter 2014

The full system – Part A

[Block diagram: software block and hardware block connected over AXI. HDMI in and HDMI out pass through the FMC interface; the video path runs through the video detector, video resolution, AXI4S_in (VTC_0), the VDMA frame buffer in DDR, AXI4S_out (VTC_1), and the video generator.]

Page 9: Final presentation winter 2014

Project part B

Use the embedded system we built in the previous project.

Perform complex processing using the software.

Perform complex processing using the hardware.

Combine the two kinds of processing into a single project.

Page 10: Final presentation winter 2014

Our system works with the “luma” representation (“YCbCr”) instead of RGB in order to be more efficient.

RGB pixel -> 32 bit

YCbCr pixel -> 16 bit

Video color space

Page 11: Final presentation winter 2014

RGB format uses 32 bits for each pixel. Each pixel is composed of 4 channels of 8 bits each; the first three are the color channels: the RED, GREEN and BLUE colors.

The fourth channel is the intensity (transparency) channel (0 – transparent pixel, 1 – opaque).

RGB format is divided into two main representations, alpha and beta: alpha is represented as discussed above, while in beta the RGB channels are already multiplied by the intensity channel to achieve higher efficiency.

RGB video format
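For clarity, these are the standard compositing formulas behind the two representations (not stated on the slides): with straight alpha the blend is out = a*src + (1 - a)*dst, while with premultiplied alpha the stored channels are already src' = a*src, so the blend reduces to out = src' + (1 - a)*dst and saves one multiply per channel.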

Page 12: Final presentation winter 2014

This format is an encoding of the RGB format, and the final pixel we see on screen depends on how this format is interpreted (by the hardware video-out component).

The 8 LSBs are the Y component, which is the intensity component (luminance).

The 8 MSBs are the Cb and Cr components, 4 bits each; they are the color components (chroma).

YCbCr video format
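As an illustration, a minimal C sketch of unpacking and packing such a pixel; since the slides do not state which nibble holds Cb and which holds Cr, this sketch assumes Cb in bits [11:8] and Cr in bits [15:12]:

```c
#include <stdint.h>

/* Assumed layout: Y in bits [7:0], Cb in bits [11:8], Cr in bits [15:12]. */
static inline uint8_t pix_y(uint16_t p)  { return (uint8_t)(p & 0xFF); }
static inline uint8_t pix_cb(uint16_t p) { return (uint8_t)((p >> 8) & 0x0F); }
static inline uint8_t pix_cr(uint16_t p) { return (uint8_t)((p >> 12) & 0x0F); }

static inline uint16_t pix_pack(uint8_t y, uint8_t cb, uint8_t cr)
{
    return (uint16_t)(((cr & 0x0F) << 12) | ((cb & 0x0F) << 8) | y);
}
```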

Page 13: Final presentation winter 2014

We build an (x, y) grid on the standard color map; the x and y axes are controlled by the Cb and Cr components, so given Cb and Cr we know the color.

Now we add a z axis to set the brightness of the pixel, which is controlled by the Y (luminance) component.

How does it work

Page 14: Final presentation winter 2014

Cb and Cr are the X and Y axes (respectively), as we can see in the figure below.

The Z axis (the luminance component) points into the page.

How does it look

Page 15: Final presentation winter 2014

The way our eye sees colors allows us to modify and manipulate pixels in the much smaller luma format (16 bit) instead of the larger 32-bit RGB format.

This lets us achieve higher efficiency when manipulating pixels (or just streaming video frames), which is why many displays use the luma format.

For more accurate manipulation it is possible to use the RGB format, but then we have to change the hardware components.

*Note: Xilinx offers suitable IP cores; as an example we will show the hdmi_out_rgb core.

In conclusion

Page 16: Final presentation winter 2014

RGB hardware components – this is the RGB -> luma interpretation:

• Timing control
• Converts data from the AXI4-Stream video protocol interface to the video-domain interface
• Used to convert between different video formats
• Interpretation from signals to pixels

Page 17: Final presentation winter 2014

SOFTWARE

Page 18: Final presentation winter 2014

Software manipulation

Pressing the switches freezes a video frame, and pressing them back returns to streaming. While frames are frozen we can manipulate them as much as we want. We chose to manipulate the colors.
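As an illustration, a minimal sketch of how the switch polling could look, assuming the switches are connected through an AXI GPIO core and read with the standard Xilinx xgpio driver; the device-ID macro is a placeholder whose exact name depends on the design:

```c
#include "xgpio.h"
#include "xparameters.h"
#include "xstatus.h"

/* Placeholder device ID: the actual macro name in xparameters.h
 * depends on how the AXI GPIO core was named in the design. */
#define SWITCHES_DEVICE_ID  XPAR_AXI_GPIO_0_DEVICE_ID

static XGpio switches;

int switches_init(void)
{
    int status = XGpio_Initialize(&switches, SWITCHES_DEVICE_ID);
    if (status != XST_SUCCESS)
        return status;
    /* Channel 1, all pins configured as inputs. */
    XGpio_SetDataDirection(&switches, 1, 0xFFFFFFFF);
    return XST_SUCCESS;
}

/* Poll the switches; each set bit selects one manipulation,
 * and clearing the bit returns to streaming. */
u32 switches_read(void)
{
    return XGpio_DiscreteRead(&switches, 1);
}
```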

Page 19: Final presentation winter 2014

Microcontroller manipulation

Page 20: Final presentation winter 2014

Software block diagram

[Block diagram: input video signals pass through the video detector, video resolution, frame buffer, and video generator to the output video signals; the software manipulation block operates on the frame buffer.]

Page 21: Final presentation winter 2014

Software processing results

Each switch is responsible for one processing operation.

Page 22: Final presentation winter 2014

How does it look

Frames enter and leave the frame buffer (frames in -> frame buffer -> frames out).

Page 23: Final presentation winter 2014

Freeze frame and process it

[Diagram: frames are being processed in the frozen frame buffer; arriving frames are thrown away; processed frames go out.]

Page 24: Final presentation winter 2014

Processed frames sent to display

[Diagram: the processed frames leave the frozen frame buffer to the display; arriving frames are still thrown away, until no frames remain in the frame buffer.]

Page 25: Final presentation winter 2014

Inside the frame buffer we have up to 7 frames; each frame consists of 1920×1080 pixels.

We iterate over the frames, then we iterate over each pixel to manipulate its data.

We built 4 manipulations (one for each switch): zeroing each color stream (Cr, Cb), swapping between the two color streams, and swapping between the intensity and the color channels (Y, CbCr).

Software architecture
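A minimal C sketch of this loop structure, under the assumption that the frames sit contiguously in DDR as 16-bit YCbCr pixels; the names (process_buffer, frame_base) are illustrative, and the real base address and frame count come from the VDMA configuration (cache maintenance is omitted for brevity):

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_WIDTH   1920
#define FRAME_HEIGHT  1080
#define MAX_FRAMES    7      /* up to 7 frames in the frame buffer */

/* Iterate over the frames, then over each pixel, applying the
 * manipulation selected by the active switch. */
static void process_buffer(volatile uint16_t *frame_base, int num_frames,
                           uint16_t (*manip)(uint16_t))
{
    for (int f = 0; f < num_frames; f++) {
        volatile uint16_t *frame =
            frame_base + (size_t)f * FRAME_WIDTH * FRAME_HEIGHT;
        for (int y = 0; y < FRAME_HEIGHT; y++) {
            for (int x = 0; x < FRAME_WIDTH; x++) {
                size_t i = (size_t)y * FRAME_WIDTH + x;
                frame[i] = manip(frame[i]);
            }
        }
    }
}
```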

Page 26: Final presentation winter 2014

How does it look

We have our frames inside the frame buffer.

We iterate over the frames in series; when a frame is selected, it is sent to the manipulation process.

Page 27: Final presentation winter 2014

Manipulation process

We iterate over each pixel in the manipulation process.

*Note: picture not to scale.

Page 28: Final presentation winter 2014

Pixel manipulation

Our 16-bit pixel representation:

8 bit Y intensity channel

4 bit Cb color channel

4 bit Cr color channel

At this point we have full access to the pixel information, and we can manipulate the pixel by changing the information represented in these bits.

Page 29: Final presentation winter 2014

1st manipulation

Our 16 bit pixel representation

8 bit Y intensity channel

4 bit Cb color channel

4 bit Cr color channel

Cb channel all zeroes; Y & Cr channels untouched.

Page 30: Final presentation winter 2014

2nd manipulation

Our 16 bit pixel representation

8 bit Y intensity channel

4 bit Cb color channel

4 bit Cr color channel

Cr channel all zeroes; Y & Cb channels untouched.

Page 31: Final presentation winter 2014

3rd manipulation

Our 16 bit pixel representation

8 bit Y intensity channel

4 bit Cb color channel

4 bit Cr color channel

Cb and Cr channels are swapped; Y channel untouched.

Page 32: Final presentation winter 2014

4th manipulation

Our 16 bit pixel representation

8 bit Y intensity channel

4 bit Cb color channel

4 bit Cr color channel

Cb & Cr channels are swapped with the Y channel.
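All four manipulations reduce to simple bit operations on the 16-bit pixel. A C sketch, using the same assumed bit layout as before (Y in bits [7:0], Cb in bits [11:8], Cr in bits [15:12]):

```c
#include <stdint.h>

/* 1st manipulation: zero the Cb nibble; Y and Cr untouched. */
static inline uint16_t manip_zero_cb(uint16_t p)   { return p & 0xF0FF; }

/* 2nd manipulation: zero the Cr nibble; Y and Cb untouched. */
static inline uint16_t manip_zero_cr(uint16_t p)   { return p & 0x0FFF; }

/* 3rd manipulation: swap the Cb and Cr nibbles; Y untouched. */
static inline uint16_t manip_swap_cbcr(uint16_t p)
{
    uint16_t y  = p & 0x00FF;
    uint16_t cb = (p >> 8)  & 0x0F;
    uint16_t cr = (p >> 12) & 0x0F;
    return (uint16_t)((cb << 12) | (cr << 8) | y);
}

/* 4th manipulation: swap the 8-bit Y channel with the 8-bit CbCr pair. */
static inline uint16_t manip_swap_y_cbcr(uint16_t p)
{
    return (uint16_t)((p << 8) | (p >> 8));
}
```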

Page 33: Final presentation winter 2014

Processing speed

Our system clock is 148.5 MHz. The AXI4 bandwidth is 32 bits of I/O, and a pixel represented in luma (YCbCr) is 16 bits, so every clock cycle 2 pixels enter and leave the CPU. Our frame size is 1920×1080 ≈ 2×10^6 pixels, so the frame frequency is:

(150×10^6 [cycles/sec] × 2 [pixels/cycle]) / (2×10^6 [pixels/frame]) ≈ 150 [Hz]

Page 34: Final presentation winter 2014

Trying to do so is equivalent to implementing a single-core GPU.

A more elaborate explanation: each frame has 1920×1080 pixels and the buffer holds up to 15 frames; the minimum is 3 frames, so the total amount of pixels to manipulate is about 6×10^6.

Software processing actually iterates over each pixel, so even without taking into consideration architecture penalties such as branch mispredictions at the end of every loop (about 6×10^3 outer loops), and taking the superscalar pipeline into account, we still have millions of iterations to do in a very short time.

Impossible to manipulate video on the fly

Page 35: Final presentation winter 2014

The time between every two frames is (1/150) sec.

Our CPU works at ~1 GHz, so each cycle takes ~1 ns.

Each iteration consists of at least 100 cycles (again assuming the best-case scenario, e.g. all data and instructions already in the L1 cache), so manipulating the whole buffer takes about 6×10^6 iterations × 100 cycles ≈ 6×10^8 cycles, i.e. almost 1 sec, while we have only 1/150 sec.

Impossible to manipulate video on the fly

Page 36: Final presentation winter 2014

Can only manipulate a single frame at a time

Need to stop streaming

May be useful for some applications

For processing HDMI video on the fly we have only one solution: use hardware, the FPGA.

Problems & solutions

Page 37: Final presentation winter 2014

HARDWARE

Page 38: Final presentation winter 2014

Hardware manipulation

The hardware is able to achieve real-time manipulation of the video stream. The manipulation we chose to implement is color manipulation, which is based on our software manipulation.

Page 39: Final presentation winter 2014

As we saw, it is impossible to process video in real time with only software running on our microcontroller, so we have to use hardware in order to achieve our goal.

We would like to perform in hardware a manipulation similar to the one we tried to do with our software.

The place in the embedded system we chose for our hardware manipulation is the HDMI_OUT block.

Hardware manipulation

Page 40: Final presentation winter 2014

HDMI_OUT block manipulation

Page 41: Final presentation winter 2014

Similar to the software, in our hardware we also have access to the whole pixel information (its bits).

Our manipulation is done by changing the wiring of each pixel, using 3 video processing algorithms.

The algorithms are Luma2RGB, RGB2RGB grey scale and RGB2Luma.

We can achieve any real-time color processing due to the use of wiring manipulation.

HDMI_OUT block manipulation

Page 42: Final presentation winter 2014

How does it look

[Waveform: the video_data[15:0] signal arriving from the AXI block and the video_data_d[15:0] signal leaving the HDMI_OUT buffer block, shown in hexadecimal; the received luma and chroma data are visible.]

Page 43: Final presentation winter 2014

Adding manipulation

[Block diagram: the video_data[15:0] signal arrives from the AXI block into the new HDMI_OUT buffer block, passes through the manipulation block, and leaves as the video_data_d[15:0] signal.]

Page 44: Final presentation winter 2014

Manipulation block

[Block diagram: a colored pixel enters the manipulation block and a greyscale pixel leaves it; inside, the pixel passes through luma2rgb, rgb2rgb grey scale, and rgb2luma.]

Page 45: Final presentation winter 2014

We use the ordinary equations to perform the transformation :

R' = 1.164(Y - 16) + 1.596(Cr - 128)
G' = 1.164(Y - 16) - 0.813(Cr - 128) - 0.392(Cb - 128)
B' = 1.164(Y - 16) + 2.017(Cb - 128)

Luma 2 RGB

Page 46: Final presentation winter 2014

We use the ordinary equations to perform the transformation :

R = 0.3*R' + 0.587*G' + 0.115*B'
G = 0.3*R' + 0.587*G' + 0.115*B'
B = 0.3*R' + 0.587*G' + 0.115*B'

(All three channels receive the same weighted sum, which is what makes the pixel grey.)

RGB 2 RGB grey scale

Page 47: Final presentation winter 2014

We use the ordinary equations to perform the transformation :

Y = 0.299*R + 0.587*G + 0.114*B
Cb = (B - Y)*0.564 + 0.5
Cr = (R - Y)*0.713 + 0.5

RGB 2 Luma
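In the FPGA these transformations are implemented as fixed-point wiring; as a reference, a floating-point C model of the three algorithms, directly following the equations on the slides (the +0.5 chroma offsets are written as +128 for 8-bit values):

```c
#include <stdint.h>

/* Clamp an intermediate result to the valid 8-bit range. */
static inline uint8_t clamp8(float v)
{
    if (v < 0.0f)   return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)v;
}

/* Luma -> RGB. */
static void luma2rgb(uint8_t y, uint8_t cb, uint8_t cr,
                     uint8_t *r, uint8_t *g, uint8_t *b)
{
    *r = clamp8(1.164f * (y - 16) + 1.596f * (cr - 128));
    *g = clamp8(1.164f * (y - 16) - 0.813f * (cr - 128) - 0.392f * (cb - 128));
    *b = clamp8(1.164f * (y - 16) + 2.017f * (cb - 128));
}

/* RGB -> grey-scale RGB: all three channels get the same weighted sum. */
static void rgb2rgb_grey(uint8_t r, uint8_t g, uint8_t b,
                         uint8_t *ro, uint8_t *go, uint8_t *bo)
{
    uint8_t grey = clamp8(0.3f * r + 0.587f * g + 0.115f * b);
    *ro = *go = *bo = grey;
}

/* RGB -> Luma. */
static void rgb2luma(uint8_t r, uint8_t g, uint8_t b,
                     uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    float yf = 0.299f * r + 0.587f * g + 0.114f * b;
    *y  = clamp8(yf);
    *cb = clamp8((b - yf) * 0.564f + 128.0f);
    *cr = clamp8((r - yf) * 0.713f + 128.0f);
}
```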

Page 48: Final presentation winter 2014

Hardware processing results

Page 49: Final presentation winter 2014

Software vs. Hardware

Software: can only manipulate a single frame at a time; needs to stop the streaming; not real-time processing.

Hardware: can manipulate the arriving frames; no need to stop the streaming; real-time processing.

Page 50: Final presentation winter 2014
Page 51: Final presentation winter 2014

In our project we learned the correct way to work with embedded systems, and the stronger and weaker sides of each component. For example, we learned that to achieve real-time processing of the data itself we must use the hardware components of our system, while for stream processing or data parking we can use our microcontroller and software processing.

In conclusion