
INTERNATIONAL JOURNAL OF GEOMATICS AND GEOSCIENCES Volume 1, No 3, 2010

© Copyright 2010 All rights reserved Integrated Publishing services

Research article ISSN 0976 – 4380

327

High Resolution Data Processing for Spatial Image Data Mining Md Ateeq Ur Rahman1, Shaik Rusthum2

1 – Professor, Department of Computer Science & Engineering, Shadan College of Engineering & Technology, Hyderabad, India

2 – Professor & Principal, Brilliant Institute of Engineering & Technology, Hyderabad, India

[email protected]

ABSTRACT This paper contributes towards the development of an adaptive learning system for automated segmentation and prediction of isolated regions in given spatial images. The effect of spatial distortion is observed in the spatial images under different processing noise conditions. A method for image denoising and for extracting shape and textural feature information using multiwavelet transformation is suggested. The regions in the image are estimated using a global graph theory technique. A methodology to provide guidance for mining remote sensing image data is proposed. To improve the accuracy of estimation, hierarchical clustering over distributed data samples is presented. The concept of linear relations among various clusters is explored and incorporated in the data mining approach. The retrieval time and classification accuracy have been evaluated for various cases.

Keywords: Clustering, Data Mining, Denoising, Spatial Image Processing, Wavelet Transformation, Representative features.

1. Introduction

With the development of technology and advances in scientific data collection, we are faced with a large and continuously growing amount of data that is impossible to interpret manually. This problem is more severe in spatial databases, as both their number and their size are growing rapidly because of the large amount of images obtained from satellites and other scientific equipment. Remote image sensing from airborne and spaceborne platforms provides valuable data for mapping, environmental monitoring, disaster management, and civil and military intelligence. However, to exploit the full value of these data, the appropriate information has to be extracted and presented in a standard format so that it can be imported into geo-information systems and thus support efficient decision processes. Although there has been a large research effort in content-based image retrieval (CBIR) techniques, the specific problem of mining remote sensing image databases has received much less attention.

Data mining is a process of automatic extraction of novel, useful and understandable patterns from a large collection of data. Over the past decade, data mining, or knowledge discovery in databases (KDD), has become a significant area both in academia and in industry. The growth in the size and number of existing databases far exceeds human abilities to analyze such data, thus creating both a need and an opportunity for extracting knowledge from databases (Jiawei Han et al., 1992). Applications of various data mining techniques for image analysis and classification have recently been reported in the literature (Jiawei Han et al., 1992, Schroder et al., 2000 and Rushing et al., 2005). Common drawbacks of most of these methods are: (i) most of the applications studied and reported in the literature use image data where the number of classes is known a priori, (ii) image analysis often involves interactive or semiautomatic procedures and (iii) unsupervised learning algorithms usually discover classes that have no real physical interpretation. Our overall goal is to demonstrate that a novel data mining approach employing multiwavelets can be used for automated recognition and classification of objects in aerial images, overcoming these drawbacks.

Wavelet theory could naturally play an important role in data mining since it is well founded and of very practical use (Tao Li et al., 2002). Wavelets have many favorable properties, such as vanishing moments, a hierarchical and multiresolution decomposition structure, linear time and space complexity of the transformations, decorrelated coefficients and a wide variety of basis functions. These properties could provide considerably more efficient and effective solutions to many data mining problems. First, wavelets could provide simplified representations of data that make the mining process more efficient and accurate. Second, wavelets could be incorporated into the kernel of many data mining algorithms. Although standard wavelet applications are mainly on data which have temporal or spatial locality (e.g. time series, stream data, and image data), wavelets have also been successfully applied to diverse domains in data mining. In practice, a wide variety of wavelet-related methods have been applied to a wide range of data mining problems.

Although wavelets have attracted much attention in the data mining community, there has been no comprehensive review of wavelet applications in data mining. Conventional techniques for incorporating wavelet transformations in data mining were suggested in the past (Tao Li et al., 2002). These techniques use various filters to decompose an image into resolution information and process it for recognition. Though these techniques are suitable for general image samples, it is observed that they have lower recognition accuracy on spatial image data sets. As spatial datasets contain large variations, conventional wavelet techniques are not suitable. Multiwavelet transformation has recently emerged as a new image representation approach for resolution decomposition. In this work, an approach is suggested that integrates multiwavelet transformation to exploit the high textural variations in spatial images for better recognition and classification.

2. Spatial Image Processing

Spatial Image Processing can be defined as the act of examining spatial images for the purpose of identifying objects and judging their significance (www.gisdevelopment.net). Spatial Image Processing is done to study remotely sensed data for detecting, identifying, classifying, measuring and evaluating the significance of physical and cultural objects, their patterns and spatial relationships. While dealing with spatial image data, there is a need to process the image so as to provide high resolution information for efficient processing and recognition. The manual observation of such images is a tedious task, and the observed parameters for estimation in the captured images are also erroneous due to noise introduced during image capture and transmission as well as due to various environmental conditions. Powerful signal processing methods have been developed to explore the hidden information in advanced sensor data (Curlander and Kober, 1992, Tsatsoulis, 1993, Heene and Gautama, 2000). However, all these signal processing algorithms are applied to pixels or rectangular areas and do not take contextual information into account. Image processing methods and data and information fusion have to be added to exploit the full information, both of the physics of the sensor measurements and of the context within the scene. We propose a new strategy to overcome the above problems by employing spatial image processing and applying it to data mining for the automatic interpretation of remotely sensed imagery. A model of expert systems for image processing is introduced that discusses which image processing operators, and which combinations of them, are effective in analyzing an image for recognition and interpretation. This paper focuses on a methodology that overcomes the effect of noise and extracts features capturing high resolution variation. The developed methodology is outlined in the following sections.

3. Spatial Image Denoising using Multiwavelet approach

In remote sensing, various sources of distortion are introduced into spatial images during image capture and transfer. This affects the estimation of the bounding regions of the captured image. To estimate the regions in a given image accurately, resolution estimation and filtration techniques using wavelets were proposed in (Chambolle et al., 1998, Grace et al., 2000, Starck et al., 2002 and Javier Portilla et al., 2003). One of the most widely used applications of wavelet decomposition is the removal of additive white Gaussian noise from noisy images. The separation between the actual image and the noise is achieved by choosing an orthogonal basis that efficiently approximates the image with few nonzero coefficients. In the conventional denoising method, image enhancement is obtained by discarding components below a predetermined threshold value (Chambolle et al., 1998 and Grace et al., 2000). This thresholding technique assumes that the noise on each coefficient is independent. However, this assumption may not always be valid for the coefficients of a multiwavelet decomposition. Therefore, a multivariate thresholding scheme for one-dimensional images using multiwavelets is introduced. Instead of thresholding individual multiwavelet coefficients, coefficient vectors are considered and the thresholding operation is applied to the whole vector. Using this threshold value, a multiwavelet approach for spatial image denoising is developed as outlined below. The captured noisy image is first decomposed with Geronimo, Hardin, and Massopust (GHM) multiwavelets, and then each coefficient vector is thresholded with the universal threshold value. The multiwavelet decomposition is similar to the wavelet decomposition, but has some differences.
The multiwavelet filterbanks have two channels, so the decomposition will have two sets of scaling coefficients and two sets of wavelet coefficients, leading to better precision. For image denoising applications, the spatial dependence of the pixels should be restored so that coefficients with high correlation can undergo a common thresholding operation. A vector of length four can be formed by using the coefficient from each of the four subbands. These coefficients are chosen such that they have the same location in their respective subbands. Then, a multivariate thresholding operation is applied to each of these vectors. This procedure is repeated separately for all of the coefficients in all three subbands HH, HL, and LH. The thresholding technique treats the highly correlated multiwavelet coefficients simultaneously and applies a common threshold to the vector of coefficients. The effect of correlation is compensated by a whitening transformation. The covariance matrix used for the transformation depends on the decomposition level and is calculated for the HH, HL, and LH subbands separately at each decomposition level. The threshold value λ is taken from a sequence λN such that,

$P(M \le \lambda_N) \to 1$ as $N \to \infty$ (1)

Where P is probability, M is Row and N is Column.

As the number of pixels goes to infinity, the probability of the threshold being greater than the maximum of the noise random variables approaches one. This guarantees that, with high probability, the image component exists in coefficients that are larger than the threshold value. The threshold for the method is defined as,

$\lambda_N = \sqrt{2 \log N + 2 \log\log N}$ (2)
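As a minimal sketch of how the threshold in equation (2) could be applied, the Python fragment below computes λN and performs a hard threshold on length-four coefficient vectors. It assumes the vectors have already been whitened (so that the squared Euclidean norm can be compared against λN squared) and that the natural logarithm is intended; the function names `universal_vector_threshold` and `hard_threshold_vectors` are hypothetical and are not taken from the paper.

```python
import math


def universal_vector_threshold(n):
    """Return lambda_N = sqrt(2 log N + 2 log log N) from equation (2).

    Assumption: log is the natural logarithm and n is the number of pixels.
    """
    if n <= 1:
        raise ValueError("n must be greater than 1 so that log(log(n)) is defined")
    radicand = 2.0 * math.log(n) + 2.0 * math.log(math.log(n))
    if radicand < 0:
        raise ValueError("n is too small for the threshold to be real-valued")
    return math.sqrt(radicand)


def hard_threshold_vectors(vectors, lam):
    """Zero every coefficient vector whose squared norm is below lam**2.

    Assumption: the vectors are already whitened, so a plain Euclidean
    norm is an appropriate measure of their energy.
    """
    result = []
    for vec in vectors:
        energy = sum(c * c for c in vec)
        if energy >= lam * lam:
            result.append(list(vec))          # keep the whole vector
        else:
            result.append([0.0] * len(vec))   # discard the whole vector
    return result


lam = universal_vector_threshold(1024)
denoised = hard_threshold_vectors([[0.1, 0.2, 0.1, 0.0], [5.0, 0.0, 0.0, 0.0]], lam)
```

The whole vector is kept or discarded together, which mirrors the idea of applying a common threshold to correlated coefficients rather than thresholding each one individually.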

Therefore, λN from equation (2) is the value used for thresholding the multiwavelet coefficient vectors. Using this threshold value, the given spatial image is denoised. The threshold varies with the density of the noise affecting the image. The denoised coefficients are then used in the region prediction and segmentation approach, which isolates the distinct regions in the given image based on the global graph method.

4. Region Segmentation

To estimate the regions in the given image, the image is traced using a global graph theory technique. The global graph theory technique is used for edge detection and edge linking: it represents the edge segments in the form of a graph and searches the graph for low-cost paths that correspond to significant edges. This technique is simple in operation and provides good performance even in the presence of noise. Compared with previous methods (Yuriet et al., 2001 and Ding et al., 2001), this process is considerably less complex and requires less processing time. In this graph theory concept, we first define a graph G = (N, U), where N is a set of nodes and U is a set of arcs. Each pair (ni, nj) of U is called an arc. A graph in which the arcs are directed is called a directed graph. If an arc is directed from node ni to node nj, then nj is said to be a successor of the parent node ni. In each graph we define levels, such that level 0 consists of a single node, called the start or root node, and the nodes in the last level are called goal nodes. An expansion is defined as identifying the successors of the nodes from the root node to the goal nodes. Next, a cost function C(ni, nj) can be associated with every arc (ni, nj). A path is a sequence of nodes n1, n2, …, nk, with each ni being a successor of ni-1. The cost of a path is the sum of the costs of the arcs constituting the path. An edge (arc) element defined between two neighboring pixels p and q is identified by their coordinates (xp, yp) and (xq, yq), and the cost associated with that edge element is defined by equation 3.

$C(p, q) = H - [f(p) - f(q)]$ (3)

Here H is taken to be the highest gray-level value in the image, and f(p) and f(q) are the gray-level values of pixels p and q, as in the standard graph-theoretic edge-linking formulation.
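The cost in equation (3) and the search for a lowest-cost path can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not stated in the paper: H is the maximum gray level of the image, edge elements are considered only between horizontally adjacent pixels (p on the left, q on the right), and a path runs from the top row to the bottom row, moving at most one column sideways per row. Dynamic programming is used here in place of a general graph search, and the names `edge_costs` and `lowest_cost_edge_path` are hypothetical.

```python
def edge_costs(image):
    """Cost C(p, q) = H - (f(p) - f(q)) for each horizontally adjacent pair.

    Assumption: H is the highest gray level present in the image.
    """
    h = max(max(row) for row in image)
    return [[h - (row[j] - row[j + 1]) for j in range(len(row) - 1)]
            for row in image]


def lowest_cost_edge_path(image):
    """Return (path, cost): one edge-element column per row, top to bottom."""
    costs = edge_costs(image)
    cols = len(costs[0])
    total = [costs[0][:]]   # total[i][j]: cheapest path cost ending at (i, j)
    back = []               # back[i - 1][j]: predecessor column of (i, j)
    for i in range(1, len(costs)):
        row_total, row_back = [], []
        for j in range(cols):
            candidates = [k for k in (j - 1, j, j + 1) if 0 <= k < cols]
            best = min(candidates, key=lambda k: total[i - 1][k])
            row_total.append(total[i - 1][best] + costs[i][j])
            row_back.append(best)
        total.append(row_total)
        back.append(row_back)
    # Pick the cheapest goal node in the last row and trace the path back.
    j = min(range(cols), key=lambda k: total[-1][k])
    best_cost = total[-1][j]
    path = [j]
    for row_back in reversed(back):
        j = row_back[j]
        path.append(j)
    path.reverse()
    return path, best_cost


# A vertical step edge between columns 1 and 2 of a small test image.
step_image = [[7, 7, 0, 0],
              [7, 7, 0, 0],
              [7, 7, 0, 0]]
path, cost = lowest_cost_edge_path(step_image)
```

For the step image, the strong gray-level drop between columns 1 and 2 gives those edge elements the lowest cost, so the recovered path follows that boundary in every row.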

Based on these cost values, the graph search finds the lowest-cost path, and the edge elements along it are extracted. The method chooses the region based on the cost assigned to each possible edge element, with respect to the highest threshold value of the given region. The boundaries are declared from the edge points obtained with minimum cost. Selecting edges with respect to the highest pixel value results in the fewest edge points and provides the minimum pixel bounding regions, since the links with minimum cost are chosen as the best edge links. These edge links together form a bounding region for each distinct region of the given spatial image. The predicted bounding regions are then segmented using an image cropping operation with the specified location coordinates. These isolated regions are then processed to extract the global shape feature and the texture feature, which are used to define and recognize objects in the given image.

5. Descriptive Feature Estimation

The isolated regions are processed for shape and texture feature estimation, and the system is trained on the extracted regions. The features extracted are the global shape feature and the textural variations of the image sample used. The features are extracted using multiwavelet transformation.

5.1 Shape Feature Extraction

Shape is recognized as one of the most fundamental characteristics that describe image content. During the past decades many shape retrieval methods have been proposed (Loncaric, 1998, Veltkamp and Hagedoorn, 1999, Zhang and Lu, 2004). To achieve invariance, most approaches (e.g., medial axes, shape signature, and all the moment-based methods) have to apply a normalization process beforehand. The most popular normalization techniques for planar shapes involve the computation of the center of gravity, the orientation of the principal axis and the size of the bounding box. However, normalization is difficult when one must deal with planar shapes that can themselves vary. The normalization process may easily introduce errors and decrease the accuracy of shape retrieval.

To overcome the disadvantages mentioned above, instead of applying normalization and analyzing directly in the spatial domain, the proposed method uses the wavelet and Radon transforms to parameterize the generalized morphological properties of the shapes. The Radon transformation of a function f(x,y) is the integral of the function along lines in the plane.

(4)

Where t is the distance along the line, the integral over a series of parallel lines becomes the projection, and the set of all the projections is the Radon transformation. To extract the image shape features effectively, translation invariants and scale invariants of the Radon transformation are constructed. Let Radon transformation of image is , then the k-steps distance is defined as,

(5)

Centroid of can be defined as,

(6)

The translation of the image will make the projection translate. So, the central moment can demonstrate the translation invariant.

(7)

Let, (8)

Scale invariant can be derived as,

(9)

A matrix can be derived from the above equation; the singular values of this matrix are calculated to obtain a four-dimensional vector Fs. This vector is the shape feature vector of the image.
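For orientation, the standard textbook definitions that this kind of construction usually rests on are sketched below. They are given under the assumption that the usual Radon transform and projection moments are meant; the notation (R, m_k, μ_k, η_k) is introduced here for illustration and may differ from the authors' equations (4) to (9), which are not reproduced.

```latex
% Standard Radon transform of f(x, y): the integral of f along the line
% at angle \theta and signed offset \rho; t is the position along the line.
\[
R(\rho, \theta) = \int_{-\infty}^{\infty}
  f(\rho\cos\theta - t\sin\theta,\ \rho\sin\theta + t\cos\theta)\, dt
\]
% k-th order moment of a projection, its centroid, and the central moment.
% Translating the image only shifts each projection along \rho, so the
% central moments \mu_k are translation invariant.
\[
m_k(\theta) = \int \rho^{k} R(\rho, \theta)\, d\rho, \qquad
\bar{\rho}(\theta) = \frac{m_1(\theta)}{m_0(\theta)}, \qquad
\mu_k(\theta) = \int \bigl(\rho - \bar{\rho}(\theta)\bigr)^{k} R(\rho, \theta)\, d\rho
\]
% Scaling the image by a > 0, f'(x, y) = f(x/a, y/a), gives
% R'(\rho, \theta) = a R(\rho/a, \theta) and \mu'_k = a^{k+2} \mu_k,
% so the normalized ratio below is scale invariant.
\[
\eta_k(\theta) = \frac{\mu_k(\theta)}{\mu_0(\theta)^{(k+2)/2}}
\]
```

Under these definitions, a matrix of normalized moments over several angles and orders would be one natural input to the singular value computation described above, although the exact matrix used in the paper is not recoverable from the text.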

5.2 Texture Feature Extraction

The second feature used for the description is the texture feature. Texture is the tendency of the gray levels of pixels in the image and is a kind of region feature. The changes in gray level of nearby pixels are interrelated, so texture can be defined as the pattern produced by a certain variation of gray levels. Several approaches for extraction of texture features have been proposed in the literature. A variety of model-based and feature-based approaches using wavelet transformation have been proposed for image texture classification (Unser, 1995 and Aujol et al., 2003). Until recently, only scalar wavelets (wavelets generated by one scaling function) were widely used. But one can imagine a situation in which there is more than one scaling function. This leads to the notion of multiwavelets, a more recent generalization with a higher number of distinct scaling functions than wavelets, offering further theoretical and experimental advantages.

Multiwavelets can simultaneously provide perfect reconstruction while preserving length (orthogonality), good performance at the boundaries (via linear phase symmetry) and a high order of approximation (vanishing moments). The high frequency sub-layers obtained from the multiwavelet carry much of the texture information of the original image, so the wavelet coefficients in the high frequency sub-layers can represent the texture feature of the image. In this paper, a 2-layer GHM multiwavelet is used and the image is divided into 27 sub-layers. Feature coefficients are obtained from the wavelet coefficients in the three highest frequency sub-layers. Let wavelet coefficient of H1 sub-layer be

(10)

Eigenvalue f1 of diagonal direction is derived as:

(11)

Eigenvalue f2 and f3 of vertical and horizontal directions can also be derived as:

(12)

(13)

The three eigenvalues (f1, f2, f3) form a three-dimensional vector Ft that represents the texture feature. These features are estimated for each distinct object of the region and are compared with a feature table containing the averaged feature vectors of various similar spatial objects that have been segmented and processed. The number of feature vectors obtained per image is large, and the features vary widely. This variation results in very slow convergence of the feature vectors and lowers overall system performance. To make estimation faster, a hierarchical clustering based estimation approach is used.

6. Hierarchical Clustering

Clustering large spatial databases is an important problem, which tries to find the densely populated regions in the feature space to be used in data mining, knowledge discovery, or efficient information retrieval. A good clustering approach should be efficient and detect clusters of arbitrary shape. It must be insensitive to noise (outliers) and to the order of the input data. A spatial result oriented hierarchical clustering approach is proposed which builds a cluster hierarchy or, in other words, a tree of clusters, also known as a dendrogram. Every cluster node contains child clusters; sibling clusters partition the points covered by their common parent. Such an approach allows exploring data at different levels of granularity. Hierarchical clustering based on linkage metrics is used, which results in clusters of proper shapes. Active contemporary efforts to build cluster systems incorporate the concept of clusters as connected components of arbitrary shape, and texture features are used for clustering. Hierarchical clustering frequently deals with the N × N matrix of distances (dissimilarities) or similarities between training points, sometimes called the connectivity matrix. Linkage metrics are constructed from elements of this matrix. The requirement of keeping such a large matrix in memory is unrealistic. This can be overcome by omitting entries smaller than a certain threshold, by using only a certain subset of data representatives, or by keeping with each point only a certain number of its nearest neighbors. For example, nearest neighbor chains have a decisive impact on memory consumption. A sparse matrix can further be used to represent intuitive concepts of closeness and connectivity. With the sparsified connectivity matrix we can associate the connectivity graph G = (X, E), whose vertices X are data points and whose edges E and their weights are pairs of points and the corresponding positive matrix entries (Ding et al., 2001). This establishes a connection between hierarchical clustering and graph partitioning. Hierarchical clustering initializes a cluster system as a set of singleton clusters or as a single cluster of all points, and proceeds iteratively by merging or splitting the most appropriate clusters until the stopping criterion is met (Chris Ding and Xiaofeng, 2002). The appropriateness of clusters for merging or splitting depends on the dissimilarity of their elements. This reflects a general presumption that clusters consist of similar points. An important example of dissimilarity between two points is the distance between them. To merge or split subsets of points rather than individual points, the distance between individual points has to be generalized to the distance between subsets. Such a derived proximity measure is called a linkage metric.

The type of linkage metric used significantly affects hierarchical algorithms, since it reflects the particular concept of closeness and connectivity. Major inter-cluster linkage metrics include single link, average link, and complete link. The underlying dissimilarity measure (usually, distance) is computed for every pair of points with one point in the first set and another point in the second set. A specific operation such as minimum (single link), average (average link), or maximum (complete link) is applied to the pair-wise dissimilarity measures.
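The three linkage operations described above, and the bottom-up merging they drive, can be sketched in a few lines of Python. This is an illustrative, unoptimized version that recomputes all pairwise distances at each step and uses Euclidean distance as the dissimilarity; it is not the SLINK or CLINK algorithm, and the function names are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def linkage_distance(c1, c2, operation="single"):
    """Distance between clusters c1 and c2 under the chosen linkage metric."""
    pairwise = [euclidean(x, y) for x in c1 for y in c2]
    if operation == "single":      # minimum over all cross-cluster pairs
        return min(pairwise)
    if operation == "complete":    # maximum over all cross-cluster pairs
        return max(pairwise)
    if operation == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError("unknown linkage operation: " + operation)


def agglomerate(points, num_clusters, operation="single"):
    """Merge the two closest clusters until num_clusters remain."""
    clusters = [[p] for p in points]   # start from singleton clusters
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage_distance(clusters[i], clusters[j], operation)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


groups = agglomerate([(0, 0), (0, 1), (5, 5), (5, 6)], 2)
```

Stopping at a fixed number of clusters is only one possible stopping criterion; a distance cutoff on the merge cost would serve equally well in this sketch.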

$d(C_1, C_2) = \operatorname{operation}\{\, d(x, y) \mid x \in C_1,\ y \in C_2 \,\}$ (14)

Early examples include the algorithm SLINK, which implements single link, Voorhees method, which implements average link, and the algorithm CLINK, which implements complete link. Of all these, SLINK is referenced the most. It is related to the problem of finding the Euclidean minimal spanning tree and has O(N^2) complexity. The methods using inter-cluster distances defined in terms of pairs with points in two respective clusters (subsets) are called graph methods. They do not use any cluster representation other than a set of points. This name naturally relates to the connectivity graph G = (X, E) introduced above, since every data partition corresponds to a graph partition. Such methods can be supplemented by so-called geometric methods, in which a cluster is represented by its central point. This results in centroid, median and minimum variance linkage metrics. Under the assumption of numerical attributes, the center point is defined as a centroid or an average of two cluster centroids subject to agglomeration. In this work there are two feature vectors for the image, namely the shape feature vector Fs and the texture feature vector Ft. For the clustering operation, the similarity measurement is made as outlined below.

1) Calculate the Euclidean distance between the trained shape feature vector (Fsl) and the object shape feature vector (Fsi), which is derived as:

(15)

Where qi and ki are the elements of Fsi and Fsl .

2) Calculate the Euclidean distance between the trained texture feature vector (Ftl) and the object texture feature vector (Fti), which is derived as:

(16)

Where pi and li are the elements of Fti and Ftl .

3) Calculate the weighted similarity of the image and retrieve the object using the two distances, derived as:

(17)

Where is the weight of image similarity measurement. These clusters are then used to segregate image objects and to categorize the query sample based on cluster distance. The suggested system based on this approach is outlined in the following section.

7. System Design

The developed approach for object recognition from spatial images is outlined in figure 1. The system operates in two phases, namely the Training phase and the Testing phase. The system initially performs a training operation in which the spatial image dataset is processed using multiwavelet transformation for denoising. The regions in the image are estimated using the global graph theory technique. The regions in the image are then subjected to multiwavelet transformation, which outputs the resolution information that reveals the textural and shape variations in the spatial image. The decomposed coefficients obtained after multi-level decomposition are averaged over the total feature length. Each averaged resolution is considered as a feature vector ‘F’ and is indexed in a database memory for the corresponding spatial image. Clustering is then performed for the segregation of features.

Figure 1: Proposed image recognition system architecture

On querying, the process is repeated for the given input spatial image, and the obtained features are mapped to the stored spatial image information using a Euclidean distance approach. The distance vectors are searched for the minimum distance, and the index of the minimum is taken as the retrieved spatial image information. A learning rule is applied to train the knowledge base to perform a particular task. Supervised learning and unsupervised learning are the two learning rules; in this project, supervised learning is used for training. Training and testing are performed using spatial objects in images. Here we compare the feature vectors with those of the various objects and find the closest match. The distance measurement is used for the prediction of the best match. The distance is given by,

Z (i,j) = x(i,j) – c (i,j) (18)

where Z(i,j) is the distance matrix, x(i,j) is the feature element of the trained sample and c(i,j) is the feature element of the test sample. The classifier outputs the classified object information based on the least distance observed.
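The least-distance matching described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The convex combination w * d_shape + (1 - w) * d_texture, the default weight w = 0.5, the function names and the sample feature values are all assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_distance(query, trained, w=0.5):
    """Combine the shape and texture distances into one score.

    ASSUMPTION: a convex combination with weight w; the exact weighting
    used in the paper is not recoverable from the text.
    """
    d_shape = euclidean(query["shape"], trained["shape"])
    d_texture = euclidean(query["texture"], trained["texture"])
    return w * d_shape + (1.0 - w) * d_texture


def closest_match(query, knowledge_base, w=0.5):
    """Return (label, distance) of the stored entry with the least distance."""
    best_label, best_dist = None, float("inf")
    for label, trained in knowledge_base.items():
        d = weighted_distance(query, trained, w)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


# Hypothetical knowledge base: labels and feature values are made up.
knowledge_base = {
    "building": {"shape": [0.9, 0.1], "texture": [0.2, 0.4, 0.6]},
    "water": {"shape": [0.2, 0.8], "texture": [0.7, 0.1, 0.3]},
}
query = {"shape": [0.8, 0.2], "texture": [0.25, 0.35, 0.6]}

label, dist = closest_match(query, knowledge_base)
print(label, round(dist, 4))
```

Setting w closer to 1 makes the match depend mostly on shape, and setting it closer to 0 makes it depend mostly on texture.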

The processing in the training and testing phases is outlined below.

Algorithm 1: Training phase
Input: Collected image observations
Output: Array of indexed feature vectors
Begin:
Step 1: Training set images are taken.
Step 2: Perform the denoising operation using multiwavelet transformation.
Step 3: Isolate the object regions into distinct regions.
Step 4: Perform multi-level scaling and decomposition on the isolated regions.
Step 5: Compute feature vectors for the decomposed coefficients (shape/texture).
Step 6: Compute the average of each of the features along with the rules and store them in the knowledge base.
Step 7: Perform clustering of the obtained features for the complete database.
End
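The clustering step that closes Algorithm 1 can be sketched as a bottom-up (agglomerative) procedure using the centroid linkage described earlier. The sketch below is only illustrative: the function names, the stopping rule (a fixed number of clusters) and the sample feature values are assumptions and are not taken from the implemented system.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(features, n_clusters):
    """Merge the two clusters with the closest centroids until n_clusters remain.

    Returns a list of clusters, each a list of indices into `features`.
    ASSUMPTION: stopping at a fixed cluster count is chosen for illustration.
    """
    if not 1 <= n_clusters <= len(features):
        raise ValueError("n_clusters must be between 1 and len(features)")
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            ca = centroid([features[i] for i in clusters[a]])
            for b in range(a + 1, len(clusters)):
                cb = centroid([features[i] for i in clusters[b]])
                d = euclidean(ca, cb)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return clusters


# Hypothetical averaged feature vectors for five image regions.
features = [[0.10, 0.20], [0.12, 0.19], [0.90, 0.80], [0.88, 0.83], [0.50, 0.10]]
clusters = agglomerate(features, 3)
print(clusters)
```

Each pass recomputes the cluster centroids, which keeps the sketch short at the cost of speed. A practical system would cache the centroids or use a library routine.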

Algorithm 2: Testing phase
Input: Input image for recognition of object information.
Output: Recognized objects and information.
Begin:
Step 1: Input the test sample image.
Step 2: Denoise the given sample using multiwavelets.
Step 3: Estimate the regions from the image.
Step 4: Perform multiwavelet transformations on the given query sample for texture and shape feature extraction.
Step 5: Extract the decomposed regions and their features for the given image regions.
Step 6: Perform clustering of the obtained features.
Step 7: Compute the average feature values for the extracted clusters.
Step 8: Compare the average range feature values of the test sample with the values stored in the knowledge base.
Step 9: If the values match approximately, the test sample is classified as one of the trained samples.
Step 10: Compare the classified images with the query image to recognize objects, together with the percentage of matching.
End

The developed system is evaluated over different case studies, and various evaluation parameters were observed, as outlined in the next section.

8. Results and Observations

The developed system is tested on different spatial data sets. The database considered is taken from ‘satiamgingcorp’, and its images contain different classes of objects with varying color, shape and texture features. The images were chosen to demonstrate the general applicability of the interactive recognition approach to objects of varying uniformity, size, shape and contrast.

For the evaluation of the suggested approach, white Gaussian noise was first added to the image at different noise levels. The noisy image was then decomposed with Geronimo, Hardin, and Massopust (GHM) multiwavelets, and each coefficient vector was thresholded with the universal threshold value to denoise the image. The results show that the highest denoising performance is attained with multiwavelets, compared with all single wavelets. Figure 2 shows the noisy input image sample, and figure 3 shows the image after denoising using multiwavelets.
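The thresholding step can be illustrated with a small Python sketch. Several simplifications are assumed here and should not be read as the method used in this work: a one-level scalar Haar transform stands in for the GHM multiwavelet decomposition, the data is a one-dimensional signal rather than an image, the noise level is estimated from the median absolute detail coefficient divided by 0.6745, and soft thresholding is used. Only the universal threshold value, sigma * sqrt(2 ln N), corresponds to the description above.

```python
import math
import random
import statistics

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length sequence."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return out


def universal_threshold(detail, n):
    """sigma * sqrt(2 ln n), with sigma estimated from the detail coefficients."""
    sigma = statistics.median([abs(d) for d in detail]) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))


def soft_threshold(value, t):
    """Shrink a coefficient towards zero by t (ASSUMPTION: soft rule)."""
    return math.copysign(max(abs(value) - t, 0.0), value)


def denoise(signal):
    """Threshold the detail coefficients and reconstruct the signal."""
    approx, detail = haar_forward(signal)
    t = universal_threshold(detail, len(signal))
    return haar_inverse(approx, [soft_threshold(d, t) for d in detail])


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical test signal: a step edge with additive white Gaussian noise.
rng = random.Random(0)
clean = [1.0] * 32 + [5.0] * 32
noisy = [v + rng.gauss(0.0, 0.5) for v in clean]
restored = denoise(noisy)
print(mse(noisy, clean), mse(restored, clean))
```

For this piecewise-constant signal the clean detail coefficients are zero, so shrinking the noisy ones can only reduce the error. Real images need several decomposition levels and a two-dimensional transform.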


Figure 2: Input query image with Gaussian noise effect

Figure 3: Input image after denoising employing Multiwavelets

The regions in the given image are estimated using the graph based segmentation method discussed in section IV. In this method a graph G = (N, U) has a node corresponding to each feature point (each pixel), and there is an arc (ni, nj) connecting each pair of feature points ni and nj that are nearby in the feature space. Together these arcs form a bounding region for each distinct region of the given spatial image. The predicted bounding regions are then segmented using an image cropping operation at the specified location coordinates. The isolated distinct regions and the segmented regions of the input image after processing are shown in figure 4 and figure 5 respectively.
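The idea of linking nearby pixels with arcs and cropping the resulting bounding regions can be shown with a simplified Python sketch. It is a stand-in rather than the global graph technique used in this work: arcs are placed only between 4-connected neighbours whose intensities differ by at most max_diff, regions are the connected components found with a union-find structure, and the threshold, function names and sample image are assumptions.

```python
def find(parent, i):
    """Return the root of pixel i, shortening the path along the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def segment(image, max_diff):
    """Group 4-connected pixels with similar intensity into regions.

    Returns a sorted list of bounding boxes (top, left, bottom, right),
    with inclusive coordinates, one box per region.
    """
    rows, cols = len(image), len(image[0])
    parent = list(range(rows * cols))
    for r in range(rows):
        for c in range(cols):
            # Arcs to the right and lower neighbours cover every 4-connected pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and abs(image[r][c] - image[rr][cc]) <= max_diff:
                    a = find(parent, r * cols + c)
                    b = find(parent, rr * cols + cc)
                    if a != b:
                        parent[b] = a
    boxes = {}
    for r in range(rows):
        for c in range(cols):
            root = find(parent, r * cols + c)
            top, left, bottom, right = boxes.get(root, (r, c, r, c))
            boxes[root] = (min(top, r), min(left, c), max(bottom, r), max(right, c))
    return sorted(boxes.values())


def crop(image, box):
    """Cut the rectangle given by an inclusive bounding box out of the image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical 4 x 5 grey-level image with three distinct regions.
image = [
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 10, 10],
    [90, 90, 10, 10, 10],
]
boxes = segment(image, max_diff=20)
print(boxes)
print(crop(image, boxes[1]))
```

Because a bounding box is rectangular, the crop of an irregular region also contains pixels from its neighbours, as the box of the background region in this example shows.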

Figure 4: Isolated distinct regions of input image sample


Figure 5: Segmented regions of input image sample

The shape and texture features are derived using the Radon transformation and the multiwavelet transformation, respectively. The eigenvalues f1, f2 and f3 obtained from the wavelet coefficients form a three-dimensional vector ft, which represents the texture feature. This ft is estimated for each distinct object in the image. The shape and texture features obtained for the given query image samples are given in table 1. To speed up estimation, a hierarchical clustering based estimation approach is applied to the image.

Table 1: Extracted (a) Shape feature and (b) Texture feature vector for sample images

[Plot for Figure 6: "Computation Time plot"; x-axis: methods; y-axis: Computation Time (Sec); legend: Wavelet-estimation, Proposed-MW-estimation]

Figure 6: Computation time plot for the two methods

Figure 6 compares the computation time of the conventional recognition architecture, which uses a wavelet based approach, with that of the proposed multiwavelet method. As can be seen from figure 6 and table 2, multiwavelet based recognition requires less computation time than the conventional wavelet based approach.

Figure 7: Original images and isolated segmented regions observed

Table 2: Region prediction and computation time observations obtained for different image samples.

[Plot for Figure 8: "region Recall rate plot"; x-axis: PSNR (0 to 30); y-axis: % of accurate estimation (0.2 to 1); legend: Wavelet-estimation, Proposed-MW-estimation]

Figure 8: Obtained retrieval accuracy for extracted region

A similar observation is carried out for different sample image data sets, and the segmented regions obtained are shown in figure 7. The recall rate obtained for the estimated regions and their recognition is shown in figure 8, and it is much better than that of the conventional wavelet approach.

9. Conclusion

This paper presents a method for spatial image processing based on multiwavelet transformation and a hierarchical clustering approach. The efficiency of object recognition is evaluated with various quality metrics. The observations show that the recognition accuracy is higher for the system using multiwavelet transformation than for the conventional wavelet based approach. The texture information of the image estimated using multiwavelet transformation exploits the resolution variation more accurately than the conventional approach. Multiwavelet based denoising, Radon transformation based shape extraction and multiwavelet based texture feature extraction operations are also outlined.
