JOURNAL OF MULTIMEDIA, VOL. 9, NO. 6, JUNE 2014
© 2014 ACADEMY PUBLISHER
doi:10.4304/jmm.9.6.781-788

Research of Smoke Detection on Visual Saliency Method

Junling Liu
College of Computer Science and Technology, Jilin University, Changchun, China, 130012
Department of Information Engineering, Jilin Teacher’s Institute of Engineering and Technology, Changchun, China, 130052
Email: hr000539@163.com

Hongwei Zhao*
College of Computer Science and Technology, Jilin University, Changchun, China, 130012
*Corresponding author, Email: [email protected]

Abstract: Smoke detection is the key to early fire warning, but a unified standard for smoke detection is hard to reach because environments and combustion materials differ. Considering that smoke occurs continuously and that its visual saliency becomes more obvious as it is integrated over time, this paper proposes a multi-step accumulation of inter-frame difference algorithm to rapidly find the regions of a video in which moving targets can appear, which reduces the detection range. Within the small matrix of each motion region, low-rank and sparse matrix decomposition is adopted to separate the moving foreground objects from the background. In complex outdoor scenes, the drift of smoke and the translucence of its color are more obvious, so the smoke target can be locked by means of the growth of the motion regions and saliency detection in HSV color space. The experiment compares the proposed method with current mainstream saliency algorithms applied to smoke detection, and the proposed method achieves good detection speed and accuracy. The method can be applied to different video scenes and achieves good detection results even in low-resolution and strong-noise scenes.

Index Terms: Smoke Detection; Saliency; Accumulation of Inter-Frame Difference; Sparse Decomposition; Motion Region; Regional Growth

I. INTRODUCTION

Smoke is a sign of fire and accompanies it. Using visual saliency to achieve rapid smoke detection in complex scenes such as meadows, forests and tunnels can make up for the limitations of traditional smoke detectors and improve the early-warning ability of fire monitoring. In complex scenes, smoke is a kind of diffused turbulence that not only shows abundant motion morphologies and size changes, but also has visual features such as flickering and background blur. Smoke detection technology based on visual saliency can improve the real-time performance and reliability of smoke detection. Scholars at home and abroad have done much research on this problem. Earlier methods rely on low-level features such as color, texture and contour edges [1-3], but because of the diversity of fire derivatives and the complexity of fire scenarios, the visual features of smoke vary, which leads to high false alarm and omission rates. Methods based on the Fourier transform and the wavelet transform [4-5] analyze images in both the frequency domain and the spatial domain to detect smoke in videos, but frequency domain analysis usually targets a particular form of smoke and can hardly satisfy the application requirements of certain occasions. Using features of video frames such as the information redundancy between images, the small changes of the background, and the specific regularities of smoke motion, Yuan et al. [6] put forward a smoke detection algorithm based on a cumulative motion direction model, which rapidly estimates the directions of smoke motion. By calculating the optical flow in scenes, Kopilovic et al. [7] found that optical flow motion features can distinguish smoke from targets without these motion features. However, the accuracy of the optical flow calculation, the imaging conditions of the monitored area and other factors have a great influence on the accuracy of smoke detection. In recent years, with the development of computer vision, introducing saliency methods into fire detection can greatly improve detection efficiency [8].

In video frames, moving targets are more likely to attract attention, so adopting visual saliency to detect smoke targets that have both motion and special visual features can narrow the detection area and improve the detection speed. Zhou and Hou et al. [9] adopted a visual saliency algorithm based on frequency domain analysis and used the multi-frame Fourier spectrum phase difference to find moving targets in a dynamic background. This method is simple and fast to compute, but the contours of the detected salient targets are blurry and the detection accuracy is not high when the image resolution is set to 120×160. Xue Y et al. [10] used low-rank approximation and sparse decomposition to separate the moving foreground objects from the background, and adopted spatial information to preserve the integrity of the detected moving objects. This method suits different video scenes and achieves good detection results even in scenes with low resolution and strong noise, but its computational complexity is high.

Because the background of monitored fire scenes changes little, and most early smoke appears in only part of the scene with rising and diffusing features, this research uses the frame difference background method to extract the moving salient regions. Considering the continuity of smoke and the way its visual saliency becomes more and more obvious as it accumulates over time, the research puts forward the multi-step accumulation of inter-frame difference algorithm to rapidly find the regions with moving targets in a video, which reduces the volume of image data to be processed. Within the motion salient regions, sparse and low-rank matrix decomposition is used to find the motion foreground, and targets are locked by the color and motion features of smoke. This paper is divided into five parts. The second part introduces the extraction of salient regions by the multi-step accumulation of inter-frame difference; the third part introduces the extraction of moving objects based on salient regions; the fourth part covers smoke detection based on the low-level visual features of smoke and the method for detecting smoke targets in videos; and the fifth part is the experimental study.

II. EXTRACTION OF SALIENT MOTION REGIONS

A. Multi-step Accumulation of Inter-Frame Difference Algorithm

The traditional adjacent frame difference method [11] subtracts frames and applies a threshold to judge whether there are moving objects in an image sequence. This method is not very sensitive to lighting and other scene changes and adapts to various dynamic environments with good stability, but it cannot extract the complete region of an object and can only extract its borders. The background subtraction method [12] first selects one image, or the average of several images, as the background image; if the number of pixels obtained from the background subtraction is larger than a threshold, it is judged that there are moving objects in the monitored scene. This method is simple in design and fast to compute, and it reflects the location, size, shape and other information of moving objects fairly accurately, but it is greatly affected by changes in external conditions such as lighting and weather. In view of the advantages and disadvantages of the two algorithms above, this paper combines the background subtraction method with the inter-frame difference method and, according to the accumulation property of smoke motion, puts forward the multi-step accumulation of inter-frame difference algorithm, which is simple to implement and cheap to compute. The purpose of the algorithm is to find the regions that may contain moving objects rather than to detect the moving objects themselves. The algorithm takes the initial frames of a video, computes inter-frame differences with several different frame step sizes, and sums them cumulatively to generate several motion saliency maps. The average of these maps is then subtracted from each map as background, the results are summed, and Gaussian filtering is applied to generate the motion saliency region map of the video frames.

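$$f(x, y) = \sum_{i=1}^{M} \left( R_i(x, y) - B(x, y) \right) \tag{1}$$

$$R_i(x, y) = \sum_{j=1}^{N} \left( I_n(x, y) - I_m(x, y) \right)_j \tag{2}$$

$$B(x, y) = \frac{1}{M} \sum_{i=1}^{M} R_i(x, y) \tag{3}$$

$$g(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{4}$$

$$\mathrm{MotionRegions}(x, y) = g(x, y) * f(x, y), \qquad \mathrm{MotionRegions}(x, y) = \begin{cases} 1, & \mathrm{MotionRegions}(x, y) \ge \mathrm{Threshold} \\ 0, & \mathrm{MotionRegions}(x, y) < \mathrm{Threshold} \end{cases} \tag{5}$$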

In formulas (1) to (5), MotionRegions represents the motion areas, $R_i(x, y)$ is the cumulative sum of the inter-frame differences for the $i$-th frame step size, and $B(x, y)$ is the background, obtained as the mean of the accumulated maps. Let $n_{frame}$ be the number of extracted video frames, $k_i$ the $i$-th frame step size, and $M$ the maximum number of frame steps, with $1 \le m \le n \le n_{frame}$, $n = m + k_i$, $1 \le k_i \le n_{frame}$, $1 \le i \le M$, and $M = \lceil n_{frame} / k_i \rceil$, where $\lceil \cdot \rceil$ denotes rounding up. A Gaussian filtering window is applied as in formula (4), and the pixel values after Gaussian filtering are taken.
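Formulas (1) to (5) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of absolute frame differences, the value of sigma, and the default threshold (mean plus one standard deviation) are all assumptions. The sketch also takes the absolute deviation of each map from the background, because a signed sum of deviations from their own mean would cancel to zero.

```python
import numpy as np


def gaussian_smooth(img, sigma=1.0):
    """Separable Gaussian filter built from formula (4), normalized to sum 1."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    # Filter along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def motion_regions(frames, steps, sigma=1.0, threshold=None):
    """Binary motion-region mask from multi-step accumulated frame differences.

    frames: array of shape (n_frame, height, width), grayscale.
    steps:  iterable of frame step sizes k_i (hypothetical parameter name).
    """
    frames = np.asarray(frames, dtype=float)
    maps = []
    for k in steps:
        # Formula (2): accumulate differences between frames that are k apart.
        # Absolute differences are an assumption of this sketch.
        maps.append(np.abs(frames[k:] - frames[:-k]).sum(axis=0))
    R = np.stack(maps)
    B = R.mean(axis=0)                      # formula (3): mean background
    f = np.abs(R - B).sum(axis=0)           # formula (1), with absolute deviation (assumption)
    smooth = gaussian_smooth(f, sigma)      # formulas (4) and (5): Gaussian filtering
    if threshold is None:
        threshold = smooth.mean() + smooth.std()   # assumed default threshold
    return (smooth >= threshold).astype(np.uint8)
```

For example, `motion_regions(frames, steps=range(10, 101, 10))` would mirror the step sizes of 10 to 100 frames used in the paper's example.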

Figure 1. Salient regions: (a) images; (b) MotionRegions. Panels are shown for step sizes k1=10, k2=20, k3=30, k4=40, k5=50, k6=60, k7=70, k8=80, k9=90, k10=100.

Taking the video frames in Figure 1 as an example, the algorithm adopts multiple frame step sizes. If the first 800 frames of the initial video are taken, the step size increases by 10 frames at a time from 10 frames to 100 frames, and 10 motion saliency maps of accumulated frame differences are generated. Moving smoke appears in the video frames of the monitoring images. With small inter-frame step sizes, more moving objects are extracted, such as driving cars and branches and grass shaken by the wind, but they are blurry and the visual saliency of all objects is low. As the inter-frame step size grows, the visual saliency of the smoke and the driving cars becomes more and more obvious. When the inter-frame step size is very long, such as 100 frames, only the visual saliency of the smoke is extracted. The edge features of the smoke can be observed in the saliency maps of accumulated frame differences for all step sizes.

B. Extraction of Salient Rectangle Regions

The salient targets observed in Section II.A are the result of multi-frame accumulation, so there is no corresponding salient object in each individual frame, but the maps do determine the area in which salient objects appear or have appeared. By accumulating multiple saliency maps, the motion of salient objects can be confined to these regions, and extracting the salient objects of every frame only within them greatly improves operating efficiency. MotionRegions in Figure 1 gives the regions where salient objects appear. To process the visual features in the video frames, we apply the following steps to the generated salient regions and obtain the 2 rectangle regions shown in Figure 2.

[B, L] = bwboundaries(MotionRegions, 'noholes');
MinRow_i = min(B{i,1}(:,1));
MaxRow_i = max(B{i,1}(:,1));
MinColumn_i = min(B{i,1}(:,2));
MaxColumn_i = max(B{i,1}(:,2));

$$\mathrm{rectangle}_i(x, y) = \begin{cases} 1, & \mathrm{MinRow}_i \le x \le \mathrm{MaxRow}_i \ \text{and} \ \mathrm{MinColumn}_i \le y \le \mathrm{MaxColumn}_i \\ 0, & \text{otherwise} \end{cases}$$

Figure 2. Salient rectangle regions: (a) mask; (b) image.
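The rectangle extraction above can be illustrated with a short NumPy sketch. It is a simplification under stated assumptions: the function names are hypothetical, and a single bounding rectangle is computed for the whole mask, whereas the listing above computes one rectangle per boundary returned by bwboundaries.

```python
import numpy as np


def bounding_rectangle(mask):
    """Return (min_row, max_row, min_col, max_col) of the non-zero pixels, or None."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max())


def crop_rectangle(image, mask):
    """Cut the bounding rectangle of the mask out of a frame."""
    box = bounding_rectangle(mask)
    if box is None:
        return image[:0, :0]          # empty crop when no motion region exists
    r0, r1, c0, c1 = box
    return image[r0:r1 + 1, c0:c1 + 1]
```

Cropping every frame to this rectangle is what keeps the matrix passed to the later decomposition small.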

III. EXTRACTION OF SALIENT MOVING OBJECTS

In 1996, Olshausen and Field put forward the sparse coding theory [13], which holds that sparse coding in the human brain is a linear superposition model that, through optimization-based learning, yields basis functions similar to the response characteristics of simple cells. When identifying images, the human brain adopts a ‘sparse coding’ strategy that is also called minimal entropy coding [14], and what is encoded is the part of the image with the smallest entropy. When the image is restored from the sparse representation, the restored image is therefore also the minimum-entropy part of the image; hence the difference between the original image and the restored image gives the part with the maximum entropy. Research shows that the region that attracts human visual attention is supposed to be the area of the image with the maximum entropy; therefore, the region of the difference between the original image and the restored image is the region of visual attention.

An image can be regarded as a multi-feature matrix composed of various types of feature vectors, and this matrix can be decomposed into a low-rank matrix and a sparse matrix, which correspond to the background region and the salient objects of the image respectively. Low-rank and sparse matrix decomposition has become a research hotspot because of its good robustness, strong generalization and resistance to disturbance. Its disadvantage is that the sparse representation method needs many training images and therefore a very large amount of computation. Applying the low-rank and sparse matrix decomposition only to the image blocks of the salient regions in Figure 2(b) reduces the amount of computation significantly.

A. Image Sparse Representation and Dictionary Learning

Sparse representation is the process of representing the main information of a signal with as few non-zero coefficients as possible so as to simplify the signal processing problem [15]. From multiple video frames, image blocks of the same size are extracted to constitute a super-large image $X = \{x_1, x_2, \ldots, x_n\}$, in which $n$ is the number of image blocks. Each image block can be obtained as the linear product of a sparse vector and a dictionary, namely $x_i = D\alpha_i + e_i$, in which $D$ is the dictionary of the sparse representation, a group of over-complete basis vectors used to represent the data of all image blocks more effectively, $\alpha_i$ is the sparse vector, and $e_i$ is the difference between the image blocks, which is also the residual of the sparse representation. The matrix $D \in R^{m \times n}$, $m < n$, is generally of full rank, and the vectors satisfy $\alpha_i \in R^n$, $x_i \in R^m$. Now $x_i$ and $D$ are known, and $\alpha_i$ and $e_i$ are to be solved. Because $m < n$, this equation set is underdetermined. If we want its solution to be as sparse as possible, that is, to have as few non-zero entries as possible, the sparsity of an image block can be measured by its 0 norm, and the constraint condition is:

$$\min \ \mathrm{rank}(\alpha_i) + \lambda \|e_i\|_0 \tag{6}$$

In formula (6), $\lambda$ is the parameter that balances the sparse residual, but this optimization problem is difficult to solve. In 2006, Terence Tao et al. [16] proved that under the RIP condition, the optimization problem of the 0 norm and the optimization problem of the 1 norm have the same solution, namely when the RIP condition is met and $\|\alpha_i\|_0 < \sigma(x_i)/2$, where $\sigma(x_i)$ is defined as the number of vectors in the smallest linearly dependent set of column vectors. The constraint condition of the sparse representation then becomes:

$$\min \ \|\alpha_i\|_* + \lambda \|e_i\|_1 \tag{7}$$

Here $\|\cdot\|_*$ is the trace norm, namely the sum of the singular values of a matrix. Local geometric structures, as detailed as possible, are taken directly from the sample set of image blocks to constitute an over-complete dictionary, which admits more than one representation and from which the sparsest solution is sought. Because the dictionary is learned from the relevant sample set, its atoms, the basic units of the dictionary, match the various features inherent in the image itself, such as outlines, edges, textures and other local geometric structures. Sparse decomposition of an image over the dictionary concentrates the signal energy on a few atoms, and it is precisely the atoms with large non-zero coefficients that match the essential features of the image. The super-resolution reconstruction technique based on sparse representation mainly includes two processes: the establishment of the over-complete dictionary and the sparse representation of the image blocks over the corresponding dictionary. Using sparse representation to extract salient objects in a video likewise takes over-complete dictionary learning and image sparse representation as its main research contents.

Dictionary learning looks for the optimal basis under sparse representation. It should not only satisfy the uniqueness constraint of the sparse representation, but also yield a sparser and more precise representation. To meet these conditions, the following problem needs to be solved over all training sets:

$$\arg\min_{D} \sum_{i} \left\| x_i - D\alpha_i - e_i \right\|_2^2, \quad i = 1, 2, \ldots \tag{8}$$

Formula (8) is usually solved in two alternating steps: first, the sparse representation of the signals is solved with the current dictionary; then the dictionary is updated according to the solved sparse representation. The K-singular value decomposition (K-SVD) algorithm [17] uses the orthogonal matching pursuit algorithm to solve the sparse representation in the first step, and then updates only one column $d_k$ of the dictionary $D$ at a time, together with the corresponding coefficient row $\alpha_T^k$. If the residual term $e_i$ of the sparse representation in the above formula is not considered, formula (8) can be rewritten as:

$$\begin{aligned} \sum_i \|x_i - D\alpha_i\|_2^2 &= \|X - D\alpha\|_F^2 \\ &= \Big\| X - \sum_{j=1}^{K} d_j \alpha_T^j \Big\|_F^2 \\ &= \Big\| \Big( X - \sum_{j \ne k} d_j \alpha_T^j \Big) - d_k \alpha_T^k \Big\|_F^2 \\ &= \|E_k - d_k \alpha_T^k\|_F^2 \end{aligned} \tag{9}$$

Here $E_k$ is the residual of representing the image blocks with all columns of the dictionary except column $k$. To make the whole expression as small as possible, $d_k \alpha_T^k$ should be as close as possible to $E_k$. Therefore, taking the SVD $E_k = U \Sigma V^T$, $d_k$ is set to the first column of $U$, and $\alpha_T^k$ is the first column of $V$ multiplied by $\Sigma(1, 1)$. $U \in R^{m \times m}$ and $V \in R^{n \times n}$ are orthogonal matrices, and $\Sigma$ is a diagonal matrix.
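The rank-one atom update behind formula (9) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name and argument layout are assumptions, and the update is restricted to the image blocks that currently use atom $k$, a detail of the standard K-SVD procedure that the text above does not spell out.

```python
import numpy as np


def update_atom(X, D, A, k):
    """One K-SVD style update of dictionary column k.

    X: (m, N) image blocks as columns, D: (m, K) dictionary,
    A: (K, N) sparse coefficients. Returns updated copies of D and A.
    """
    users = np.nonzero(A[k])[0]           # blocks whose code uses atom k
    if users.size == 0:
        return D, A                       # unused atom: nothing to update
    D, A = D.copy(), A.copy()
    A[k, users] = 0.0                     # remove atom k's contribution
    E_k = X[:, users] - D @ A[:, users]   # residual without atom k
    U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
    D[:, k] = U[:, 0]                     # d_k: first left singular vector
    A[k, users] = s[0] * Vt[0]            # coefficients: first right singular vector times Sigma(1,1)
    return D, A
```

Because the leading singular pair gives the best rank-one approximation of $E_k$ in the Frobenius norm, this step cannot increase the representation error.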

B. Sparse Matrix Solution

The $\mathrm{rectangle}_i(x, y)$ image blocks of every frame in Figure 2(b) form a new sequence of frame blocks in the video. Formulas (8) and (9) are used to complete the low-rank and sparse decomposition of the matrix and find the residual $E_i$ represented by the image blocks, namely the salient objects of every image block.

Figure 3. Salient objects solved by matrix sparse decomposition. (a) Frame image in salient regions; (b) frame block sequence; (c) matrix composed of frame blocks; (d) matrix decomposition.

For the video blocks shown in Figure 3, we only consider the gray image sequence of the first 600 frames. Taking the first image sequence as an example, each frame has a resolution of 100×80, so the data matrix $D$ composed of these image blocks has dimension 8000×600. Robust Principal Component Analysis (RPCA) solves the problem $D = A + E$, in which $A$ is low-rank, $E$ is sparse and $D$ is known; $A$ is assumed to be the background data and $E$ the salient objects. The algorithm is insensitive to noise and can process high-dimensional image data. There are many algorithms for solving RPCA. Among them, the Iterative Shrinkage Thresholding (IST) algorithm has a simple and convergent iterative form, but it converges slowly, and appropriate step sizes are hard to select, which limits its application scope. The Augmented Lagrange Multiplier (ALM) algorithm is fast, reaches higher precision, and needs less storage space. The Inexact Augmented Lagrange Multiplier (inexact ALM) algorithm improves on ALM in that it does not need to solve the subproblems precisely. Its iterative update formulas for the matrices $A$ and $E$ are:

$$A_{k+1} = \arg\min_{A} L(A, E_k, Y_k, \mu_k) = \mathcal{D}_{1/\mu_k}\left(D - E_k + Y_k / \mu_k\right) \tag{10}$$

$$E_{k+1} = \arg\min_{E} L(A_{k+1}, E, Y_k, \mu_k) = \mathcal{S}_{\lambda/\mu_k}\left(D - A_{k+1} + Y_k / \mu_k\right) \tag{11}$$

Here $\mu_k$ is a weighting parameter, $\mu_k = mn / (4\|D\|_1)$, and $Y_k$ is a matrix of the same shape as $D$.
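A compact NumPy sketch of the inexact ALM iteration in formulas (10) and (11) is given below. It is a sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the default $\lambda = 1/\sqrt{\max(m, n)}$ and the growing $\mu$ schedule follow a commonly used convention and differ from the fixed $\mu_k$ stated above, and the two thresholding operators are taken to be singular value thresholding for $A$ and entrywise soft thresholding for $E$.

```python
import numpy as np


def soft_threshold(M, tau):
    """Entrywise shrinkage: sign(M) * max(|M| - tau, 0)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)


def rpca_inexact_alm(D, lam=None, tol=1e-7, max_iter=500, rho=1.5):
    """Split D into low-rank A (background) and sparse E (salient objects)."""
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))                  # assumed default weight
    norm_two = np.linalg.norm(D, 2)
    Y = D / max(norm_two, np.abs(D).max() / lam)        # initial multiplier
    mu = 1.25 / norm_two                                # assumed initial penalty
    mu_max = mu * 1e7
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    d_norm = np.linalg.norm(D, "fro")
    for _ in range(max_iter):
        # Formula (10): singular value thresholding with threshold 1/mu.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # Formula (11): entrywise soft thresholding with threshold lam/mu.
        E = soft_threshold(D - A + Y / mu, lam / mu)
        residual = D - A - E
        Y = Y + mu * residual                           # multiplier update
        mu = min(mu * rho, mu_max)                      # increase the penalty
        if np.linalg.norm(residual, "fro") / d_norm < tol:
            break
    return A, E
```

For the 8000×600 matrix described above, each column would hold one vectorized 100×80 frame block, and the columns of `E` reshaped to 100×80 would give the per-frame salient objects.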

C. Extraction of Pixels of Salient Moving Objects

The sparse matrix obtained from the matrix decomposition in Figure 3(d) has locked the foreground objects in the video frames, but the target pixels are vague, so we adopt maximum pixel region segmentation (MPRS) to extract the pixels of the salient objects. First, the frame difference between $E_i$ and $E_1$ is computed to obtain the salient motion pixels, and a threshold is set to filter the pixels preserved during image processing. Then 4-connected component labeling is applied, with the background marked as 0, and the region with the maximum number of pixels is taken as the pixels of the salient object. Finally, the gray image pixels of frame $j$ of Rectangle$_i$ in Figure 3(b) are extracted at these positions to generate the pixels of the salient objects shown in Figure 4.

Eimg = E_j - E_1;
Threshold = mean(Eimg);
K = Eimg > Threshold;
[L, num] = bwlabel(K, 4);
[m, n] = find(num == max(num));
Object_num = size(m);
[x, y] = find(L == m);
saliencyObject = rgb2gray(Rectangle_i(j))(x, y);

Figure 4. Saliency object
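The MPRS steps can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the absolute difference and the mean-valued threshold are assumptions, and the connected component labeling is written out as a plain breadth-first search instead of calling bwlabel.

```python
from collections import deque

import numpy as np


def largest_component_4(binary):
    """Boolean mask of the largest 4-connected region of a binary image."""
    binary = np.asarray(binary, dtype=bool)
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)      # 0 marks background / unvisited
    best_label, best_size, current = 0, 0, 0
    for r in range(h):
        for c in range(w):
            if not binary[r, c] or labels[r, c]:
                continue
            current += 1
            labels[r, c] = current
            queue, size = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                # Visit the four direct neighbours only (no diagonals).
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not labels[ny, nx]:
                        labels[ny, nx] = current
                        queue.append((ny, nx))
            if size > best_size:
                best_label, best_size = current, size
    if best_label == 0:
        return np.zeros((h, w), dtype=bool)
    return labels == best_label


def salient_pixels(E_j, E_1, gray_block):
    """Keep the gray pixels of the largest moving region; zero elsewhere."""
    diff = np.abs(np.asarray(E_j, dtype=float) - np.asarray(E_1, dtype=float))
    mask = largest_component_4(diff > diff.mean())   # assumed threshold: mean
    return np.where(mask, gray_block, 0), mask
```

Keeping only the largest region discards small isolated responses that survive the threshold.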

IV. SMOKE TARGET DETECTION

A. Extraction of Motion Regions

Smoke has obvious motion attributes: in the process of movement it diffuses and grows as the motion accumulates, so by detecting growth in the target regions we can lock the suspected smoke region. The maximum connected region is taken for the salient objects of all regions in Figure 4, the positions and extents of all moving salient objects in all video frames are outlined, and the maximum and minimum horizontal and vertical coordinates of every object are extracted for judging the changes of the target regions. Here (xmin, ymin), (xmin, ymax), (xmax, ymax) and (xmax, ymin) are the four corner points of the rectangular region R that bounds an object, and the growth of the target regions shown in Figure 6 is judged by calculating the changes of this rectangular region.

Because drifting smoke, compared with other salient moving objects, is characterized by regional growth and slow displacement, the following conditions are used as constraints to find the moving smoke area: 1) the region grows along the vertical or horizontal axis; 2) the salient moving object in the latter frame occupies a larger area than in the former frame (the frame interval does not exceed 10); 3) the minimum horizontal coordinate of the latter frame is smaller than the maximum horizontal coordinate of the former frame, which ensures slow movement of the drift area.

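$$\begin{aligned} \Delta x_i &= x_{\max,i} - x_{\min,i} \\ \Delta y_i &= y_{\max,i} - y_{\min,i} \\ R_{ij} &= R_i - R_j, \quad 0 < i - j \le 10 \end{aligned}$$

if ($\Delta x_i > 0$ or $\Delta y_i > 0$) and $R_{ij} > 0$ and $x_{\min,i} < x_{\max,j}$
then bwboundaries($R_i$, 'blue'), and $R_i$ is a suspicious smoke area.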

Figure 5. Suspected smoke object’s Contour in Rectangle2
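The three growth conditions can be expressed as a small predicate. This is a hedged sketch: the function name and the box layout (xmin, xmax, ymin, ymax) are hypothetical, condition 1 is read as an increase of the rectangle's width or height, and condition 2 is evaluated on the area of the bounding rectangle, which is an assumption.

```python
def is_suspected_smoke(prev_box, curr_box, frame_gap, max_gap=10):
    """Check the three region-growth conditions between two frames.

    prev_box, curr_box: (xmin, xmax, ymin, ymax) of the former and latter frame.
    frame_gap: number of frames between the two boxes.
    """
    if not 0 < frame_gap <= max_gap:
        return False
    px0, px1, py0, py1 = prev_box
    cx0, cx1, cy0, cy1 = curr_box
    prev_w, prev_h = px1 - px0, py1 - py0
    curr_w, curr_h = cx1 - cx0, cy1 - cy0
    grows_along_axis = curr_w > prev_w or curr_h > prev_h     # condition 1
    area_grows = curr_w * curr_h > prev_w * prev_h            # condition 2
    slow_drift = cx0 < px1                                    # condition 3
    return grows_along_axis and area_grows and slow_drift
```

A fast object such as a car keeps its size and leaves its former rectangle, so it fails conditions 1 and 3, while a spreading plume passes all three.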

B. Color Detection of Smoke

At the initial stage of a fire, smoke is translucent, which weakens the color saturation of the scene behind it while having little influence on the color saturation of other, non-smoke moving objects. Therefore, the color saturation ratio of all pixels in the salient moving objects can be taken as a characteristic value for smoke detection. Because different burning materials produce smoke of different colors, it is hard to find an accurate color criterion, but the brightness of the region where smoke appears changes obviously, so the suspected smoke area can be identified through mean filtering of the brightness. Combined with the motion growth region in Figure 6, the smoke object is easy to find, as shown in Figures 6 and 7.

The hue, saturation and brightness analysis is conducted in HSV color space. The HSV color model describes colors using hue (H), saturation (S) and brightness (V), and it is a nonlinear color system. We also take color statistics into account in order to localize smoke, as its color is usually whitish-blue. We convert each RGB frame into HSV, where the Value channel stores the largest value of the RGB channels. The Hue and Saturation channels are computed according to the following conditions:

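$$\begin{aligned}
&\max = \max(R, G, B), \quad \min = \min(R, G, B) \\
&V = \max, \quad \bar{V} = \mathrm{mean}(V) \\
&S = (\max - \min) / \max \\
&\text{if } R = \max \text{ then } H = (G - B) / (\max - \min) \\
&\text{if } G = \max \text{ then } H = 2 + (B - R) / (\max - \min) \\
&\text{if } B = \max \text{ then } H = 4 + (R - G) / (\max - \min) \\
&H = H \times 60 \\
&\text{if } H < 0 \text{ then } H = H + 360
\end{aligned}$$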

Figure 6. Saliency region
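The RGB to HSV conversion described above can be written for a single pixel as follows. The function name is hypothetical, inputs are assumed to be floats in [0, 1], and the guards for black and gray pixels (where the divisions would be by zero) are additions of this sketch.

```python
def rgb_to_hsv(r, g, b):
    """Convert one RGB pixel (floats in [0, 1]) to (H in degrees, S, V)."""
    mx, mn = max(r, g, b), min(r, g, b)
    v = mx                                        # V = max
    s = 0.0 if mx == 0 else (mx - mn) / mx        # S = (max - min) / max
    if mx == mn:
        h = 0.0                                   # gray pixel: hue undefined, use 0
    elif mx == r:
        h = (g - b) / (mx - mn)
    elif mx == g:
        h = 2.0 + (b - r) / (mx - mn)
    else:
        h = 4.0 + (r - g) / (mx - mn)
    h *= 60.0                                     # scale to degrees
    if h < 0:
        h += 360.0                                # wrap negative hues
    return h, s, v
```

Dividing the returned hue by 360 gives the normalized value used by Python's standard `colorsys` module.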


Figure 7. Detect smoke object

In Figure 7, the H, S and V values of each pixel of the detected smoke object are extracted to form the color learning database of the video frame. There is no unified standard for the color features of smoke detected in different scenes and environments. Through this learning, the visual color features of smoke provide a reliable basis for smoke detection in the subsequent frames. In HSV color space, the hue and saturation ranges of smoke are found and used as the smoke judgment features:

$$\begin{cases} 0.3 \le H_i \le 0.7 \\ 0.05 \le S_i \le 0.4 \\ 0.5 \le V_i \le 1 \end{cases} \tag{12}$$

Figure 8. Smoke object
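
The threshold test of Eq. (12) can be sketched with NumPy as below. The function and constant names are hypothetical, the inequalities are taken as non-strict, and H is assumed to be normalized to [0, 1] (for example, degrees divided by 360), which the text does not state explicitly.

```python
import numpy as np

# Ranges taken from Eq. (12); H is assumed to be normalized to [0, 1].
H_RANGE = (0.3, 0.7)
S_RANGE = (0.05, 0.4)
V_RANGE = (0.5, 1.0)


def smoke_color_mask(hsv):
    """Return a boolean mask of pixels whose (H, S, V) satisfy Eq. (12).

    hsv: float array of shape (height, width, 3), all channels in [0, 1].
    """
    hsv = np.asarray(hsv, dtype=np.float64)
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return (
        (H_RANGE[0] <= h) & (h <= H_RANGE[1])
        & (S_RANGE[0] <= s) & (s <= S_RANGE[1])
        & (V_RANGE[0] <= v) & (v <= V_RANGE[1])
    )


# Example: one whitish-blue pixel and one saturated red pixel.
example = np.array([[[210.0 / 360.0, 0.2, 1.0], [0.0, 1.0, 1.0]]])
mask = smoke_color_mask(example)
```

Because the text notes that no unified color standard exists across scenes, the ranges are kept as module-level constants so that they could be replaced by values learned from the color database of a given video.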

V. EXPERIMENT

A. Significant Moving Target Detection Through Various Methods

Smoke movement is pronounced, and it shows an upward trend as it accumulates over time. The paper adopts moving object detection based on the salient region, which extracts a reduced detection area where smoke may exist and thus further improves the smoke detection speed. Low-rank and sparse decomposition of a small matrix is used to separate the moving object from the background. The salient region not only reduces the detection range but also reduces the amount of redundant background that must be processed.
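
The low-rank and sparse separation mentioned above can be illustrated with a short NumPy sketch. The solver is not specified in this section, so the code below is an assumption: it uses principal component pursuit solved by the inexact augmented Lagrange multiplier scheme, a common choice for this kind of decomposition. The function name, the default weight `lam = 1 / sqrt(max(m, n))` and the step parameters are likewise illustrative rather than taken from the paper. Each column of `D` is assumed to hold one vectorized frame of the small motion region, and `D` is assumed to be nonzero.

```python
import numpy as np


def low_rank_sparse_decompose(D, lam=None, tol=1e-7, max_iter=500):
    """Split D into a low-rank part L (background) and a sparse part S (foreground).

    Approximately solves: min ||L||_* + lam * ||S||_1  subject to  D = L + S.
    """
    D = np.asarray(D, dtype=np.float64)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))  # common default weight (assumption)
    norm_two = np.linalg.norm(D, 2)
    Y = D / max(norm_two, np.abs(D).max() / lam)  # initial multiplier
    mu = 1.25 / norm_two
    mu_max = mu * 1e7
    rho = 1.5
    S = np.zeros_like(D)
    L = np.zeros_like(D)
    d_norm = np.linalg.norm(D, "fro")
    for _ in range(max_iter):
        # Low-rank update: soft-threshold the singular values.
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        sig = np.maximum(sig - 1.0 / mu, 0.0)
        L = (U * sig) @ Vt
        # Sparse update: element-wise soft threshold.
        T = D - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        # Multiplier update and stopping test on the residual.
        R = D - L - S
        Y = Y + mu * R
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(R, "fro") / d_norm < tol:
            break
    return L, S


# Example: a static rank-1 background with one strong outlier pixel.
background = np.outer(np.linspace(0.0, 1.0, 50), np.ones(20))
frames = background.copy()
frames[5, 3] += 5.0
L, S = low_rank_sparse_decompose(frames)
```

Working on the small matrix of the motion region rather than the full frame keeps the singular value decomposition in each iteration cheap, which is consistent with the speed argument made in the text.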

Traditional motion detection usually adopts the frame difference, and a threshold is used to judge whether an object moves in the image sequence. The method is less sensitive to scene changes such as lighting and adapts to various dynamic environments with good stability. However, when the interval between the differenced frames is too small, the detected objects are numerous and complex, and the motion saliency is low. When the interval is larger, the method detects targets with higher motion saliency, but it neglects many detail changes between the frames. In recent years, visual saliency calculation based on frequency domain analysis has become a research hotspot owing to its simplicity and fast operation speed. The frequency domain phase difference method can find moving objects against a dynamic background; it is well suited to detecting pedestrians, but its omission ratio is higher for smoke and other objects with specific movement features. The sparse decomposition method, which ignores the image structure and visual features and analyzes the matrix from a mathematical perspective, is a hotspot in signal processing research. It expresses the signal in a sparse form, and the image can be expressed as a multi-feature matrix composed of various types of eigenvectors. The matrix can be decomposed into a low-rank matrix and a sparse matrix, which correspond to the image background area and the salient object respectively. The saliency map is obtained by inference from the sparse elements of the matrix. The method can detect salient moving objects, but its omission ratio on moving smoke is also higher, and so is its computational complexity.

Figure 9. Moving object saliency map with various methods

Figure 10. Comparison on the smoke detection effects with four algorithms
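
The trade-off between a small and a large frame interval in thresholded frame differencing can be sketched as follows. This is an illustrative NumPy example only; the function name, the `gap` parameter and the threshold value are assumptions and do not come from the paper.

```python
import numpy as np


def frame_difference_mask(frames, gap=1, threshold=0.1):
    """Threshold the absolute difference between frames that are `gap` apart.

    frames: array of shape (T, height, width) with gray levels in [0, 1].
    Returns a boolean array of shape (T - gap, height, width).
    """
    if gap < 1:
        raise ValueError("gap must be at least 1")
    frames = np.asarray(frames, dtype=np.float64)
    diff = np.abs(frames[gap:] - frames[:-gap])
    return diff > threshold


# Example: a single bright pixel moving one column to the right per frame.
video = np.zeros((5, 4, 5))
for t in range(5):
    video[t, 1, t] = 1.0

short_gap = frame_difference_mask(video, gap=1)  # every small step is detected
long_gap = frame_difference_mask(video, gap=4)   # only the net displacement remains
```

With `gap=1` every intermediate step of the moving pixel is reported, whereas `gap=4` reports only the start and end positions, mirroring the loss of detail at larger intervals described above.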

The paper adopts moving object detection based on the salient region: multi-frame accumulation over the initial part of the video is used to find the area where movement occurs and to extract two small matrices.

The unconditional nature and versatility of matrix decomposition are used to obtain the motion saliency map. The experiment shows that the method can track the smoke change at each frame of the video. Because it operates on the small matrix of the salient region, the calculation is fast with good real-time performance, and it reflects the edge features of the moving object. As can be seen from Figure 10, the accuracy of our detection method is significantly better than that of the other three methods. Meanwhile, smoke was detected in 100 images in only 0.036 s.
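
The multi-frame accumulation and the extraction of a small matrix from the motion region can be sketched as below. This is a simplified illustration under stated assumptions: the function names, the threshold and the minimum change count are hypothetical, the region is taken as a single bounding box, and each column of the resulting matrix holds one vectorized frame of that region.

```python
import numpy as np


def accumulate_motion(frames, threshold=0.1):
    """Count, per pixel, how often consecutive frames differ by more than threshold."""
    frames = np.asarray(frames, dtype=np.float64)
    moving = np.abs(np.diff(frames, axis=0)) > threshold
    return moving.sum(axis=0)


def motion_bounding_box(accumulated, min_count=2):
    """Return (top, bottom, left, right) of pixels that changed at least min_count times."""
    rows, cols = np.nonzero(accumulated >= min_count)
    if rows.size == 0:
        return None
    return int(rows.min()), int(rows.max()) + 1, int(cols.min()), int(cols.max()) + 1


def small_matrix(frames, box):
    """Crop the box from every frame and stack the crops as columns."""
    top, bottom, left, right = box
    region = np.asarray(frames, dtype=np.float64)[:, top:bottom, left:right]
    return region.reshape(region.shape[0], -1).T  # shape: (pixels, frames)


# Example: a 2 x 3 block that flickers in an otherwise static 8 x 8 scene.
video = np.zeros((5, 8, 8))
for t in range(5):
    video[t, 2:4, 3:6] = t % 2

acc = accumulate_motion(video)
box = motion_bounding_box(acc)
M = small_matrix(video, box)
```

In this example the matrix to be decomposed has 6 rows instead of 64, which illustrates why restricting the decomposition to the motion region reduces the computation.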

B. Smoke Detection in Different Environments

Smoke detection is affected by the environment, such as different lighting, indoor and outdoor scenes, and different types of combustion. In Figure 11, the method described above is used to detect the motion region, and the brightness filter in HSV color space is used to detect smoke. Experimental results show that the proposed method adapts to the detection of different types of smoke. The videos used in the experiment are from the signal, image, and video processing group of Bilkent University [18].

Figure 11. Smoke detection in different environments
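
The combination of a motion region with a brightness filter can be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, the mean filter is a plain box filter with replicated edges, and the brightness threshold of 0.5 is borrowed from the lower bound on V in Eq. (12).

```python
import numpy as np


def mean_filter(image, size=3):
    """Box (mean) filter with edge replication; size is assumed to be odd."""
    image = np.asarray(image, dtype=np.float64)
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros_like(image)
    # Sum the size x size shifted copies of the image, then normalize.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)


def smoke_candidates(value, motion_mask, v_min=0.5, size=3):
    """Keep moving pixels whose locally averaged brightness is at least v_min."""
    return np.asarray(motion_mask, dtype=bool) & (mean_filter(value, size) >= v_min)


# Example: a bright 3 x 3 patch, with motion detected only along row 2.
value = np.zeros((5, 5))
value[1:4, 1:4] = 1.0
motion = np.zeros((5, 5), dtype=bool)
motion[2, :] = True

candidates = smoke_candidates(value, motion)
```

Averaging the brightness before thresholding suppresses isolated bright pixels, so only moving pixels that lie inside a consistently bright neighborhood are retained.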

VI. SUMMARY

Most fire monitoring in complex environments adopts visual detection. Since smoke is both the precursor and the accompanying product of fire, the key of the research is how to find the visual information of smoke within the field of view in real time among massive video images. The surveillance video studied in the paper is an outdoor scene, which contains complex cases such as smoke, moving cars and tree branches swinging in the wind. In order to improve the detection accuracy and speed, the paper proposes a method based on the salient moving region. Exploiting the fact that smoke diffuses and its motion grows cumulatively, it proposes a moving growth area algorithm that follows the drift of the smoke movement. Meanwhile, because smoke at the early stage of a fire is light in color and translucent, and its brightness differs considerably from the background, an HSV color space filter is used to lock the smoke target. The experiment compares and analyzes the effects of current mainstream visual saliency methods for smoke detection, and the detection speed and accuracy of the method adopted in the paper are clearly superior to those of the other methods.

Moreover, the method can be applied to different monitoring scenes. Even in scenes with low resolution and strong noise, it achieves a good detection effect. The experiment shows that the proposed method for visual smoke detection improves the accuracy rate, reduces the false alarm and missed detection rates, and is more robust in detecting various kinds of smoke objects under different environments.

ACKNOWLEDGMENT

The corresponding author is Zhao Hongwei. The authors are grateful to the anonymous reviewers for their insightful comments, which have certainly improved this paper. This work is supported by the Plan for Scientific and Technology Development of Jilin Province (20140101184JC).

REFERENCES

[1] T. H. Chen, Y. H. Yin, S. F. Huang, and Y. T. Ye. “The smoke detection for early fire-alarming system base on video processing,” In Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP’06, 2006.

[2] Wang Y, Chua T W, Chang R. “Real-time smoke detection using texture and color features,” Pattern Recognition (ICPR), 21st International Conference on IEEE, pp. 1727-1730, 2012.

[3] Cui Y, Dong H, Zhou E. “An early fire detection method based on smoke texture analysis and discrimination,” Image and Signal Processing, CISP'08. Congress on. IEEE, pp. 95-99, March 2008.

[4] Toreyin B U, Dedeoglu Y, Cetin A E. “Contour based smoke detection in video using wavelets,” European Signal Processing Conference, pp. 123-128, 2006.

[5] Li W, Fu B, Xiao L, et al. “A Video Smoke Detection Algorithm Based on Wavelet Energy and Optical Flow Eigen-values,” Journal of Software, pp. 63-70, 2013.

[6] F. Yuan. “A fast accumulative motion orientation model based on integral image for video smoke detection,” Pattern Recognition Letters, vol. 7, pp. 925-932, 2008.

[7] Kopilovic I, Vagvolgyi B, Szirányi T. “Application of panoramic annular lens for motion analysis tasks: surveillance and smoke detection,” Pattern Recognition, Proceedings. 15th International Conference on. IEEE, vol. 4, pp. 714-717, 2000.

[8] Liu J, Zhao H, Zhao T. “Research of Flame Detection on Visual Saliency Method,” Journal of Computers, pp. 3264-3271, 2013.

[9] Zhou B, Hou X, Zhang L. “A phase discrepancy analysis of object motion,” Computer Vision–ACCV 2010. Springer Berlin Heidelberg, pp. 225-238, 2011.

[10] Xiao-liang Q, Lei Guo, Jun-wei Han. “A Spectral Algorithm Based on Weighted Sparse Coding for Visual Saliency Detection,” Acta Electronica Sinica, vol. 6, pp. 1159-1165, 2013.

[11] Lee L, Romano R, Stein G. “Introduction to the special section on video surveillance,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, pp. 740-745, 2000.

[12] Li Gang, Qiu Shangbin, Lin Ling, Zeng Ruili. “New moving target detection method based on background differencing and coterminous frames differencing,” Chinese Journal of Scientific Instrument, vol. 8, pp. 961-964, 2006. (in Chinese)

[13] Olshausen B A. “Emergence of simple-cell receptive field properties by learning a sparse code for natural images,” Nature, vol. 6583, pp. 607-609, 1996.

[14] Olshausen B A. “Learning sparse, overcomplete representations of time-varying natural images,” Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on. IEEE, vol. 11, pp. 41-44, 2003.

[15] Yang J, Wright J, Huang T S, et al. “Image super-resolution via sparse representation,” Image Processing, IEEE Transactions on, vol. 11, pp. 2861-2873, 2010.

[16] Candes E J, Romberg J K, Tao T. “Stable signal recovery from incomplete and inaccurate measurements,” Communications on Pure and Applied Mathematics, vol. 8, pp. 1207-1223, 2006.

[17] Aharon M, Elad M, Bruckstein A. “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” Signal Processing, IEEE Transactions on, vol. 11, pp. 4311-4322, 2006.

[18] Hakan Tuna, Ibrahim Onaran, and A. Enis Cetin. “Image Description Using a Multiplier-Less Operator,” IEEE Signal Processing Letters, vol. 16, s 2009.

Jun-ling Liu is a Ph.D. candidate at the College of Computer Science and Technology, Jilin University, and an Associate Professor of Computer Science at Jilin Teacher’s Institute of Engineering and Technology. Her research interests cover scene recognition and visual saliency.

Hong-Wei Zhao is a professor at the College of Computer Science and Technology, Jilin University. He is the corresponding author of this paper. His research interests cover embedded systems and cognitive computing.
