This project proposes a depth estimation method that converts two-dimensional images of limited depth of field (DOF) into three-dimensional data. The goal is to separate the focused foreground objects from the blurred background objects in an image. Our approach is based on two observations: (1) the focused objects in an image of limited DOF correspond to regions of high spatial frequency; (2) the high-frequency areas of an image show high energy in its high-frequency wavelet sub-bands. In our approach, each image is first converted to a grayscale image and then transformed to the wavelet domain. The high-frequency areas of the image are then obtained by analyzing its high-frequency wavelet sub-bands. Finally, binarization and smoothing techniques are applied to find the position of the focused objects in the image. The experimental results demonstrate the effectiveness of our approach.
Fast 2D to 3D Conversion Using Wavelet Analysis
CHAPTER 1
INTRODUCTION
Several technologies exist for converting 2-D content for 3-D TV systems, for
example the Philips WOWvx system. This system adopts a 3-D data representation that
consists of traditional 2-D images and their associated per-pixel depth maps. Together
with the X-Y coordinates, the depth maps describe the spatial location of each point in
the images. These data are processed by a customized DSP and optical devices that emit
rays into the viewer's eyes as stereoscopic images.
The key problem remaining in the above system is how to obtain the depth
information from the 2-D data. Recently, a technology called Depth Image-Based
Rendering (DIBR) has been applied to advanced 3-D TV systems. One previously proposed
method obtains a relative depth map from a single image using wavelet analysis and edge
defocus estimation based on Lipschitz exponents. It handles images as a series of 1-D
row signals, which results in horizontal stripes in the depth map.
The depth map can be further optimized and smoothed based on color segmentation to
obtain more accurate and reliable results. In this paper, a simpler approach is
proposed to obtain the depth map of an image. In our approach, each image is first
converted to a grayscale image and then transformed to the wavelet domain. Afterwards,
the high-frequency area of the image is obtained by analyzing its high-frequency
wavelet sub-bands. Finally, binarization and smoothing techniques are applied to find
the position of the focused objects in the image.
The multiresolution wavelet transform has been shown to be an effective
technique that achieves very good performance in texture analysis. An image can be
decomposed into its wavelet coefficients by using Mallat's pyramid algorithm. After
wavelet decomposition, the energy of the image is distributed over different sub-bands,
each of which keeps a specific frequency component. In other words, each sub-band image
contains one directional feature. The wavelet decomposition is illustrated in Fig. 6.1.
Given an image (see Fig. 6.1(a)), four sub-images (see Fig. 6.1(b)), i.e. the
DC-component (upper left), H-component (upper right), V-component (lower left), and
D-component (lower right), are obtained after the wavelet decomposition. Here H, V
and D indicate horizontal, vertical and diagonal, respectively. Fig. 6.1(b) shows that
the horizontal, vertical and diagonal edges of the image can be obtained from its
wavelet decomposition.
Fig 6.1 (a) A test image and (b) its wavelet-decomposed image (or sub-images)
Fig 6.2 Illustration of the proposed wavelet-based edge detection method: (a) the horizontal component, (b) the vertical component, (c) the diagonal component, and (d) the combined result for the image given in Fig 6.1.
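The decomposition into a DC-component and the H-, V- and D-components can be sketched with a one-level 2-D Haar transform. This is only a minimal illustration: the report does not say which wavelet is used, so the Haar filter, the averaging normalization (division by 4) and the function name haar_dwt2 are assumptions made here.

```python
def haar_dwt2(image):
    """One level of a 2-D Haar transform on a list-of-lists image.

    Returns (dc, h, v, d): the approximation (DC) sub-band and the
    horizontal, vertical and diagonal detail sub-bands, each half the
    size of the input. Rows and columns must both be even in number.
    Assumption: Haar wavelet with averaging normalization (divide by 4).
    """
    rows, cols = len(image), len(image[0])
    dc, h, v, d = [], [], [], []
    for i in range(0, rows, 2):
        dc_row, h_row, v_row, d_row = [], [], [], []
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]          # top pair
            c, e = image[i + 1][j], image[i + 1][j + 1]  # bottom pair
            dc_row.append((a + b + c + e) / 4.0)  # local average
            h_row.append((a + b - c - e) / 4.0)   # top vs bottom: horizontal edges
            v_row.append((a - b + c - e) / 4.0)   # left vs right: vertical edges
            d_row.append((a - b - c + e) / 4.0)   # diagonal detail
        dc.append(dc_row)
        h.append(h_row)
        v.append(v_row)
        d.append(d_row)
    return dc, h, v, d


# A 4x4 image with a horizontal edge between its first and second rows.
image = [
    [0, 0, 0, 0],
    [8, 8, 8, 8],
    [8, 8, 8, 8],
    [8, 8, 8, 8],
]
dc, h, v, d = haar_dwt2(image)
print(h)  # only the H sub-band responds to this edge
```

In this sketch the horizontal edge shows up only in the H sub-band, while the V and D sub-bands stay at zero, which matches the statement that each sub-band carries one directional feature.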
6.3 RELATED WORK
For images of limited depth of field (DOF), the main foreground objects are
focused with sharp edges and the objects in the background are blurred. In other words,
the high frequencies are retained in the focused foreground but greatly attenuated in the
background. This suggests that the spatial frequency is directly related to the degree of
blurring, and thus to the relative distance of the object from the camera. The high
frequencies can be described by the coefficients of the wavelet transform of the image.
Larger energy in the high-frequency wavelet bands suggests more detail and less blurring
in a region, and hence a nearer 3-D location. An elementary relative depth can therefore
be estimated from the values of the wavelet coefficients in the high-frequency bands.
Based on this idea, one earlier method divides the image into macro blocks of 16 by 16
pixels and applies a wavelet transform to each macro block, generating 256 wavelet
coefficients. The relative depth is estimated by counting the number of non-zero
wavelet coefficients. Another method obtains a relative depth map from a single image
using wavelet analysis and edge defocus estimation based on Lipschitz exponents. It
handles images as a series of 1-D row signals, which results in horizontal stripes in the
depth map.
To overcome this issue, an incremental algorithm based on the wavelet transform and
edge focus analysis in two dimensions was later proposed, taking into account the
direction of edges and the two-dimensional characteristics of images. The depth map is
further optimized and smoothed based on color segmentation to obtain more accurate and
reliable results.
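The macro-block idea described above can be sketched as follows. This is a rough illustration rather than the cited method: it assumes a one-level Haar transform, counts only the detail coefficients, and introduces a hypothetical significance threshold eps, none of which is specified in the text.

```python
def block_detail_count(image, block=16, eps=0.0):
    """Count significant Haar detail coefficients in each macro block.

    image: list of rows of gray values; block: macro-block size (even).
    eps is a hypothetical threshold: a coefficient counts as non-zero
    when its magnitude exceeds eps.
    Returns one count per macro block; a higher count suggests more
    detail, i.e. a region that is less blurred and therefore nearer.
    """
    rows, cols = len(image), len(image[0])
    counts = []
    for top in range(0, rows - block + 1, block):
        row_counts = []
        for left in range(0, cols - block + 1, block):
            n = 0
            for i in range(top, top + block, 2):
                for j in range(left, left + block, 2):
                    a, b = image[i][j], image[i][j + 1]
                    c, e = image[i + 1][j], image[i + 1][j + 1]
                    # Horizontal, vertical and diagonal Haar details.
                    details = (a + b - c - e, a - b + c - e, a - b - c + e)
                    n += sum(1 for x in details if abs(x) / 4.0 > eps)
            row_counts.append(n)
        counts.append(row_counts)
    return counts


# Tiny example with 2x2 blocks: only the top-left block is textured.
image = [
    [0, 9, 5, 5],
    [9, 0, 5, 5],
    [5, 5, 5, 5],
    [5, 5, 5, 5],
]
print(block_detail_count(image, block=2))
```

The small block size is used only to keep the example readable; with block=16 the same function processes 16-pixel by 16-pixel macro blocks.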
CHAPTER 7
METHODOLOGY
The proposed depth map estimation algorithm is introduced in this section. It
can be summarized as the following steps.
Input: 2D image
1. Y component extraction
2. Wavelet-based edge detection
3. First binarization for edge enhancement
4. Smoothing for edge defocusing
5. Second binarization for noise removing
Output: 3D depth map
Fig 7: Steps Involved in Depth map Estimation
7.1 Y COMPONENT EXTRACTION
Since the focused object in an image is the object with high-frequency content (or fine
texture), the simplest way to distinguish the focused object from others is to analyze the
texture of the image. Moreover, the Y component of an image represents its overall
brightness (or luminance), and the texture of the Y component is similar to that of the
original color image. Therefore, in our approach, each image is first transformed from
the standard RGB color space to the YUV space; the Y component of the image is then
transformed to the wavelet domain.
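A minimal sketch of the Y component extraction is given below. The report does not state the conversion weights, so the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114) are assumed here, and the function name rgb_to_y is only illustrative.

```python
def rgb_to_y(rgb_image):
    """Return the luminance (Y) image of an RGB image.

    rgb_image: list of rows, each row a list of (R, G, B) tuples in 0..255.
    Assumption: BT.601 luma weights, as commonly used for RGB to YUV.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


# A 2x2 image: white, black, pure red and pure blue pixels.
rgb = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
y = rgb_to_y(rgb)
print(y)
```

Only the Y plane is passed on to the wavelet transform; the U and V chrominance planes are not needed for the texture analysis described here.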
7.2 WAVELET-BASED EDGE DETECTION
As discussed before, the depth in limited-DOF images can be measured by their
frequency content. In this step, we analyze the frequency energy based on the wavelet
transform. Basically, the edges of the focused object show high-frequency energy. Each
pixel of a wavelet sub-band image corresponds to a wavelet coefficient. The larger the
value of a wavelet coefficient, the larger the energy within the corresponding pixel. The
values of the coefficients in the high-frequency wavelet sub-bands (the H-component,
V-component, and D-component) show how sharp the details are, and therefore
give a relative depth value. The range of depth is adjusted to 0 to 255 (0 denotes black
and 255 denotes white in the depth map). A larger depth value indicates a nearer object.
Since wavelet analysis can extract the directional edges of an image easily, we
can obtain the overall edges of the image by merging its directional edges.
Given the test image shown in Fig. 6.1(a), three sub-images with different
directional edges, i.e. the H-component (see Fig. 6.2(a)), V-component (see Fig. 6.2(b))
and D-component (see Fig. 6.2(c)), are obtained after the wavelet
decomposition. By merging the three sub-images, we obtain the overall edges of the
original image, which form our initial depth map, as shown in Fig. 6.2(d).
Fig 7.1 (a) Input Image (b) its Y component (c) edges detected by wavelet-based approach
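The merging of the three directional sub-images into an initial depth map in the range 0 to 255 might look like the sketch below. The report does not say how the sub-images are merged, so summing the absolute coefficient values and scaling by the largest sum are assumptions, as is the function name merge_detail_bands.

```python
def merge_detail_bands(h, v, d):
    """Merge H, V and D sub-bands into one edge map scaled to 0..255.

    Assumption: the merged energy is |H| + |V| + |D| per pixel, and the
    largest energy in the image is mapped to 255 (white, nearest).
    """
    energy = [
        [abs(hh) + abs(vv) + abs(dd) for hh, vv, dd in zip(h_row, v_row, d_row)]
        for h_row, v_row, d_row in zip(h, v, d)
    ]
    peak = max(max(row) for row in energy)
    if peak == 0:
        # A completely flat image has no edges at all.
        return [[0 for _ in row] for row in energy]
    return [[int(round(255.0 * e / peak)) for e in row] for row in energy]


# Example sub-bands of size 2x2 (values chosen by hand).
h = [[-5, 0], [0, 0]]
v = [[0, 1], [0, 0]]
d = [[0, 0], [0, 2]]
print(merge_detail_bands(h, v, d))
```

Taking absolute values means that an edge contributes the same energy whichever side of it is brighter.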
7.3 EDGE ENHANCEMENT BY BINARIZATION
Image binarization converts an image of up to 256 gray levels to a black-and-white
image. Binarization is frequently used as a pre-processing step before optical character
recognition (OCR). The simplest way to binarize an image is to choose a threshold
value, and classify all pixels with values above this threshold as white and all other
pixels as black.
In our study, each pixel of the wavelet sub-band image corresponds to a wavelet
coefficient. The larger the value of the corresponding wavelet coefficient, the larger the
energy within the pixel. After the previous steps, the edges of the focused object show
high-frequency energy and the values of the corresponding wavelet coefficients range
from 0 to 255. To enhance the important edges, we re-assign the value of a wavelet
coefficient to 255 if its original value is larger than a particular threshold, and
re-assign it to 0 otherwise. The pixels with high-frequency energy above the
threshold are thereby enhanced.
Fig 7.2 The initial depth map after binarization using varying threshold values: (a) T=10, (b) T=15 and (c) T=20.
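The thresholding rule described in this section can be written in a few lines. This is a minimal sketch; the function name binarize is illustrative, and values exactly equal to the threshold are mapped to 0 because the text says "larger than".

```python
def binarize(image, threshold):
    """Set every value above the threshold to 255 and all others to 0."""
    return [[255 if value > threshold else 0 for value in row] for row in image]


# Example with the threshold T=15 used in the experiments.
edge_map = [
    [5, 15, 16],
    [200, 0, 15],
]
print(binarize(edge_map, 15))
```

Raising the threshold keeps fewer pixels, so weak edges disappear first.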
7.4 NOISE DEFOCUSING BY SMOOTHING
Smoothing algorithms are often applied to reduce noise and/or to prepare
images for further processing such as segmentation. They can be broadly categorized into
linear and non-linear algorithms, where the former are amenable to analysis in the Fourier
domain and the latter are not. For the implementation of a linear algorithm, the filter
can be based on a rectangular support or a circular support.
To remove the noise in the initial depth map, smoothing techniques are
used to defocus it. Here noise refers to a high-energy pixel whose neighboring pixels
have low energy. In our study, a uniform rectangular filter is adopted for smoothing,
where the output image is based on a local averaging of the input image and all of the
values within the filter support have the same weight. That is, for each pixel of the
depth map, its value is re-assigned to the average of the values of the GxG pixels
centered on that pixel. Note that the noise in the depth map can be further removed by
the subsequent binarization step.
Fig 7.3 The depth map after smoothing: (a) G=1, (b) G=2 and (c) G=3.
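A uniform GxG averaging filter of this kind can be sketched as below. Two details are not given in the report and are assumed here: at the image border the average is taken over the pixels that actually exist, and for even G (where no pixel is exactly central) the window extends one pixel further to the right and downward. The function name box_smooth is illustrative.

```python
def box_smooth(image, g):
    """Replace each pixel by the mean of the g x g window around it.

    Assumptions: windows are clipped at the image border (the mean uses
    only in-image pixels), and for even g the window is shifted so that
    it covers (g - 1) // 2 pixels before and g // 2 pixels after the pixel.
    """
    rows, cols = len(image), len(image[0])
    before, after = (g - 1) // 2, g // 2
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            window = [
                image[y][x]
                for y in range(max(0, i - before), min(rows, i + after + 1))
                for x in range(max(0, j - before), min(cols, j + after + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


# A single bright pixel (isolated noise) in a dark 3x3 image.
noisy = [
    [0, 0, 0],
    [0, 90, 0],
    [0, 0, 0],
]
print(box_smooth(noisy, 3))
```

With G=1 the window contains only the pixel itself, so the image is returned unchanged; larger G spreads the energy of an isolated pixel over its neighbors and lowers its peak.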
7.5 NOISE REMOVING BY BINARIZATION
After applying smoothing to the depth map, the energy of the noisy
pixels is reduced significantly. Therefore, the noise can be removed by
binarization, i.e. by setting a threshold and removing the pixels below it. The
problem is how to select the correct threshold. In many cases, finding one threshold
suitable for the entire image is very difficult, and sometimes even impossible. In our
study, an optimal threshold is determined through a series of experiments on the
illustrative image.
Fig 7.4 The depth map after smoothing and binarization: (a) G=1, T=0 (best), (b)
G=2, T=20 (best) and (c) G=3, T=50 (best)
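The three post-processing steps (first binarization, smoothing, second binarization) can be chained as in the sketch below. The default parameters T=15, G=3 and T=50 are the values reported for the illustrative image; the function names, the border handling of the filter and the synthetic 9x9 edge map are assumptions made only for this example.

```python
def binarize(image, threshold):
    """Set every value above the threshold to 255 and all others to 0."""
    return [[255 if value > threshold else 0 for value in row] for row in image]


def box_smooth(image, g):
    """Mean over a g x g window, clipped at the image border (assumption)."""
    rows, cols = len(image), len(image[0])
    before, after = (g - 1) // 2, g // 2
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            window = [
                image[y][x]
                for y in range(max(0, i - before), min(rows, i + after + 1))
                for x in range(max(0, j - before), min(cols, j + after + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def estimate_depth_map(edge_map, t_edge=15, g=3, t_noise=50):
    """First binarization, smoothing, then second binarization."""
    enhanced = binarize(edge_map, t_edge)   # edge enhancement
    smoothed = box_smooth(enhanced, g)      # isolated pixels lose energy
    return binarize(smoothed, t_noise)      # weakened noise is removed


def make_demo_edge_map():
    """Synthetic 9x9 edge map: a 3x3 textured region plus one noise pixel."""
    edge_map = [[0] * 9 for _ in range(9)]
    for i in range(1, 4):
        for j in range(1, 4):
            edge_map[i][j] = 40
    edge_map[6][6] = 30  # isolated high-energy pixel
    return edge_map


depth = estimate_depth_map(make_demo_edge_map())
for row in depth:
    print(row)
```

In this synthetic case the isolated pixel survives the first binarization but is averaged down by the 3x3 filter to a value below the second threshold, so it is removed, while the textured region stays white.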
CHAPTER 8
ADVANTAGES AND DISADVANTAGES
8.1 ADVANTAGES OF DWT OVER DCT
1. There is no need to divide the input into non-overlapping 2-D blocks, so DWT
achieves higher compression ratios and avoids blocking artifacts.
2. It allows good localization in both the time and spatial frequency domains.
3. Transformation of the whole image introduces inherent scaling.
4. Better identification of which data is relevant to human perception leads to a
higher compression ratio (64:1 vs. 500:1).
5. Higher flexibility: the wavelet function can be freely chosen.
8.2 DISADVANTAGES OF DWT
1. The cost of computing DWT as compared to DCT may be higher.
2. The use of larger DWT basis functions or wavelet filters produces blurring and
ringing noise near edge regions in images or video frames
3. Longer compression time
4. Lower quality than JPEG at low compression rates
8.3 FUTURE ENHANCEMENT OF WAVELET ANALYSIS
1. The combined use of wavelet transforms and singular value decomposition (SVD)
has promising applications in fingerprint reading, mine detection, etc.
2. Applying the wavelet transform to determine the type of fault, with automation
incorporating PNN, could achieve an accuracy of 100% for all types of
faults. The back-propagation algorithm could not distinguish all phase-to-ground and
double-line-to-ground faults.
3. Wavelets are a possible vehicle for investigating the issue of
market efficiency in futures markets for oil.
4. The application of wavelet theory in modeling and analyzing economic data (and
phenomena) is still in its infancy, and many properties of these models have not yet
been explored in the economics and finance literature.
8.4 APPLICATIONS OF WAVELET ANALYSIS
Wavelets are a powerful statistical tool which can be used for a wide range of
applications, namely
Signal processing
Data compression
Smoothing and image denoising
Fingerprint verification
Biology for cell membrane recognition, to distinguish the normal from the
pathological membranes
DNA analysis, protein analysis
Blood-pressure, heart-rate and ECG analyses
Finance (which is more surprising), for detecting the properties of quick variation
of values
In Internet traffic description, for designing the services size
Industrial supervision of gear-wheel
Speech recognition
Computer graphics and multifractal analysis
Many areas of physics have seen this paradigm shift, including molecular
dynamics, astrophysics, optics, turbulence and quantum mechanics.
CHAPTER 9
EXPERIMENTAL RESULT
In this preliminary experiment, a focused red flower with a blurred background is
used as the illustrative example (see Fig. 7.1(a)). Fig. 7.1(b) shows the Y component of
the image. Since the Y component is the luminance of the image, it looks like a gray-level
image. The edges of the flower are then detected using wavelet analysis, as shown in
Fig. 7.1(c). The sensitivity to the binarization threshold is investigated next.
Fig. 7.2 shows the initial depth map for varying values of the threshold T. The best
result occurs when T=15 (see Fig. 7.2(b)). Although the focused flower is separated
from the background, some noise appears around the contour of the flower. We then
examine the impact of smoothing on the depth map. Fig. 7.3 gives the depth map after
smoothing; the noise in the depth map is defocused. Fig. 7.4(a) gives the depth map
after smoothing and binarization using G=1 and threshold T=0, Fig. 7.4(b) using G=2
and T=20, and Fig. 7.4(c) using G=3 and T=50. Fig. 8.1 summarizes the best depth map
for each value of G (in our case, G = 1, 2, or 3) over different values of the
threshold T. Fig. 8.1(c) gives the best performance. In other words, the optimal
parameter combination for obtaining the depth map of Fig. 7.1(a) is G=3 and T=50.
Fig 8.1 The resulting depth map: (a) G=1 and T=0, (b) G=2 and T=20, and (c) G=3
and T=50 (best).
CONCLUSION
This paper proposes a depth estimation method that converts two-dimensional
images of limited depth of field (DOF) into three-dimensional data. The experimental
results show that simple depth maps can be obtained easily through wavelet analysis,
binarization and smoothing techniques. However, our approach has difficulty with
DOF images in which the focused object is smooth. In such a situation, high-frequency
energy lies only on the edges of the focused object. To overcome this drawback, our future
work is to incorporate color features. In addition, a user-assisted workflow associated
with visual cues might be used to solve the problem.
REFERENCES
[1] A. Redert, R.P. Berretty, C. Varekamp, O. Willemsen, J. Swillens, and H. Driessen,
"Philips 3D Solutions from Content Creation to Visualization," The 3rd Int.
Symposium on 3D Data Processing, Visualization, and Transmission, pp. 429-431,
June 2006.
[2] I. Daubechies, "The Wavelet Transform, Time-Frequency Localization and Signal
Analysis," IEEE Trans. on Information Theory, vol. 36, pp. 961-1005, 1990.
[3] S. Mallat, "A Wavelet Tour of Signal Processing," Academic Press, New York, 1999.
[4] D.F. Walnut, "An Introduction to Wavelet Analysis," Birkhäuser, Boston, 2001.
[5] C. Fehn, R.D.L. Barre, and S. Pastoor, "Interactive 3-DTV – Concepts and Key
Technologies," Proc. IEEE, vol. 94, no. 3, March 2006.
[6] S. A. Valencia, R. M. Rodriguez-Dagnino, "Synthesizing Stereo 3D Views from
Focus Cues in Monoscopic 2D Images," Proc. SPIE, vol. 5006, pp. 377-388, 2003.
[7] G. Gou, N. Zhang, L. Hou and W. Gao, "2D to 3D Conversion Based on Edge
Defocus and Segmentation," Proc. ICASSP, pp. 2181-2184, 2008.
[8] A D. Bimbo,“Visual Information Retrieval”, San Francisco Morgan Kaufmann,