Top Banner
J. Filipe and J. Cordeiro (Eds.): ICEIS 2009, LNBIP 24, pp. 415–426, 2009. © Springer-Verlag Berlin Heidelberg 2009 NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies Rafael Garcia Miani, Cristiane A. Yaguinuma, Marilde T.P. Santos, and Mauro Biajiz Department of Computer Science, Federal University of São Carlos (UFSCar) P.O. Box 676, 13565-905 São Carlos, Brazil {rafael_miani,cristiane_yaguinuma,marilde,mauro}@dc.ufscar.br Abstract. Traditional approaches for mining generalized association rules are based only on database contents, and focus on exact matches among items. However, in many applications, the use of some background knowledge, as ontologies, can enhance the discovery process and generate semantically richer rules. In this way, this paper proposes the NARFO algorithm, a new algorithm for mining non-redundant and generalized association rules based on fuzzy ontologies. Fuzzy ontology is used as background knowledge, to support the discovery process and the generation of rules. One contribution of this work is the generalization of non-frequent itemsets that helps to extract important and meaningful knowledge. NARFO algorithm also contributes at post-processing stage with its generalization and redundancy treatment. Our experiments showed that the number of rules had been reduced considerably, without redundancy, obtaining 63.63% average reduction in comparison with XSSDM algorithm. Keywords: Data Mining, Generalized Association Rules, Redundant Rules, Fuzzy Ontology. 1 Introduction Data mining is a key step of knowledge discovery in large databases [1]. One important topic in data mining research is concerned with the discovery of interesting association rules [2]. Many approaches on mining association rules are motivated by finding new ways of dealing with different attribute types or increasing computational performance. In the mean time, a crescent number of approaches have been developed regarding semantics of mined data, aiming at improving the quality of obtained knowledge. In this sense, ontologies have been widely employed to represent semantic information defined by knowledge experts, and can also be applied to enhance association rule mining. Some researches [3] [4] have extended the process of mining association rules in order to obtain rules that represent relation between basic data items, as well as between items at any level of the related taxonomy (is-a hierarchies) or ontology, resulting in the so called generalized association rules. The use of traditional ontologies based on propositional logic (crisp ontologies), which considers exact reasoning discriminated on “false” or “true”, is a common
12

NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

May 16, 2023

Download

Documents

karina tarda
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

J. Filipe and J. Cordeiro (Eds.): ICEIS 2009, LNBIP 24, pp. 415–426, 2009. © Springer-Verlag Berlin Heidelberg 2009

NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

Rafael Garcia Miani, Cristiane A. Yaguinuma, Marilde T.P. Santos, and Mauro Biajiz

Department of Computer Science, Federal University of São Carlos (UFSCar) P.O. Box 676, 13565-905 São Carlos, Brazil

{rafael_miani,cristiane_yaguinuma,marilde,mauro}@dc.ufscar.br

Abstract. Traditional approaches for mining generalized association rules are based only on database contents, and focus on exact matches among items. However, in many applications, the use of some background knowledge, as ontologies, can enhance the discovery process and generate semantically richer rules. In this way, this paper proposes the NARFO algorithm, a new algorithm for mining non-redundant and generalized association rules based on fuzzy ontologies. Fuzzy ontology is used as background knowledge, to support the discovery process and the generation of rules. One contribution of this work is the generalization of non-frequent itemsets that helps to extract important and meaningful knowledge. NARFO algorithm also contributes at post-processing stage with its generalization and redundancy treatment. Our experiments showed that the number of rules had been reduced considerably, without redundancy, obtaining 63.63% average reduction in comparison with XSSDM algorithm.

Keywords: Data Mining, Generalized Association Rules, Redundant Rules, Fuzzy Ontology.

1 Introduction

Data mining is a key step of knowledge discovery in large databases [1]. One important topic in data mining research is concerned with the discovery of interesting association rules [2]. Many approaches on mining association rules are motivated by finding new ways of dealing with different attribute types or increasing computational performance. In the mean time, a crescent number of approaches have been developed regarding semantics of mined data, aiming at improving the quality of obtained knowledge. In this sense, ontologies have been widely employed to represent semantic information defined by knowledge experts, and can also be applied to enhance association rule mining. Some researches [3] [4] have extended the process of mining association rules in order to obtain rules that represent relation between basic data items, as well as between items at any level of the related taxonomy (is-a hierarchies) or ontology, resulting in the so called generalized association rules.

The use of traditional ontologies based on propositional logic (crisp ontologies), which considers exact reasoning discriminated on “false” or “true”, is a common

Page 2: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

416 R.G. Miani et al.

point in approaches to generalize association rules. Such restriction becomes inappropriate to represent some concepts and relationships of the real world. For example, it is difficult to represent in crisp ontologies concepts such as "young", "old", "high" or "low" as well as fuzzy relationships like the similarity relation [5], which has a degree representing the strength of how concepts are similar to each other. Hence, the combination of ontologies and fuzzy logics, based on the theory of fuzzy sets [6], is suitable to express the uncertainty inherent in specific domains.

Therefore, some researches have used fuzzy ontologies in order to extract semantically richer association rules [7] [8]. However, in general, by adopting either crisp or fuzzy ontologies in generalized association rule mining, the process of extracting association rules usually brings to users a great amount of rules that represent the same information. Thus, it is interesting to avoid mining unnecessary rules that express redundant knowledge, while focusing on the semantic richness provided by fuzzy ontologies.

Considering this context, this paper proposes the NARFO (Non-redundant Association Rule based on Fuzzy Ontologies) algorithm to mine non-redundant and generalized association rules based on fuzzy ontologies. The main contribution of our research is the generalization of non-frequent itemsets, in addition to the generalization of the extracted rules and the redundancy treatment, all considering fuzzy itemsets.

The remainder of this paper is organized as follows. In section 2, we present related work. Section 3 explains the NARFO algorithm and shows its main characteristics. Performed experiments are showed in section 4. Finally, section 5 brings some conclusions and future work.

2 Related Work

Many researches have been employing taxonomies and ontologies as background knowledge in mining association rules in order to enhance the knowledge discovery process. [4] considers domain knowledge to generalize low level rules discovered by traditional rule mining algorithms, in order to get fewer and clearer high level rules. ExCIS algorithm [9] applies domain knowledge in pre and post-processing steps. The preprocessing step uses ontology to guide the construction of specific datasets for particular mining tasks. In the post-processing step, mined rules are interpreted and filtered, as terms are generalized based on the ontology.

“However, in many real-world applications, the taxonomic structure may not be crisp but fuzzy.” [1]. For this purpose, [1] developed an algorithm to mine generalized association rules with fuzzy taxonomic structures. In addition to the minimum support (minsup) and minimum confidence (minconf) measures, the algorithm considers the R-interest measure, which is used to eliminate redundant and inconsistent rules. Fuzzy association rule mining, developed by [2], is driven by domain knowledge in order to make the rules more visual, more interesting and understandable. Database attributes are mapped as linguistic variables, which are divided into linguistic terms. For example, attributes like age, education and skill (linguistic variables) concern to the high-level concept person (linguistic term) of the ontology. XSSDM algorithm [8] proposes another approach that uses fuzzy ontologies to represent the semantic similarity relations among mined data. This algorithm considers a new measure, called minimum similarity (minsim). If two items have the similarity degree greater

Page 3: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

NARFO Algorithm: Mining Non-redundant and Generalized Association Rules 417

than or equal the minsim, fuzzy associations are made and can be expressed in the association rules extracted by the algorithm. For example, if item1 and item2 have the degree of similarity greater than or equal minsim in the fuzzy ontology, a fuzzy association is made and a fuzzy itemset is created (represented as item1~item2).

Although generalized association rule mining approaches based on fuzzy ontology express semantically richer information, they may result in a great amount of redundant rules. Thus, redundancy treatment has been an interesting research topic. In [10] a multiple level association rule was proposed to reduce the number of generalized association rules. This consists in defining different minsup values for each level of a given taxonomy, in which higher levels have bigger minsup values. Some approaches focus on reducing the amount of generalized and redundant association rules during the pattern extraction process. The cSET algorithm [11] considers the concept of closed itemset [12]. The MFGI_class algorithm [13] is based on maximal frequent itemset theory [14]. There are other researches that treat the problem after the processing stage. [1] proposes the generalization process based on the R-interest measure, which prunes redundant rules, only considering the rules whose degree of support and confidence are R times the expected degree of support or confidence. So, the rules are generalized if their support and confidence are R times the expected minsup and minconf. GARPA algorithm [15] generalizes only if the descendents of an ancestor generate rules, and the rule of the ancestor has support value x% greater than the descendent that generate a rule with the biggest support among its siblings.

Table 1 compares the approaches mentioned in this section with NARFO algorithm. The presence of X in a cell indicates that the approach considers a specific feature.

Table 1. Comparison among approaches

Approaches Generalized Association

Rule

Redundancy Treatment

Fuzzy Association

Rules

Generalization of non-frequent

itemsets

Ontology

[10] X X [11] X X [15] X X [1] X X X [8] X X X [4] X X [9] X X X [2] X X [13] X X X

NARFO X X X X X

3 NARFO Algorithm

The NARFO algorithm extends and enhances the XSSDM algorithm [8] in some aspects. The first improvement is the generalization process, by including the generalization of infrequent itemsets, which is similar to the maximal frequent itemset technique cited in section 2. NARFO algorithm also performs a post-processing analysis, when rules are generated, by generalizing the rules that have all descendants

Page 4: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

418 R.G. Miani et al.

of the same immediate ancestor. Considering the fuzzy ontology of figure 1, if Tomato Chicken and Cabbage Chicken are generated rules, then the generalization of these rules is made, resulting in Vegetable Chicken, since Tomato and Cabbage represent all descendants of Vegetable concept. Another relevant feature is the redundancy elimination, considering association rules like Apple~Tomato Chicken and Apple Chicken.

In figure 2, the steps of the algorithm are illustrated. The steps highlighted are the ones where generalization and redundancy elimination are performed. In the sections bellow, the steps of the NARFO algorithm are explained. Steps from 1 to 4, and step 6 are similar to the respective ones in XSSDM.

3.1 Data Scanning

This step identifies the items in the database generating itemsets of one size (1-itemst). These items have a straight correspondence to leaf-nodes of the fuzzy-ontology and, consequently, similarity relations to their siblings can be easily found. Considering the fuzzy ontology of figure 1, this step identified the following items: Apple, Kaki, Tomato, Cabbage, Chicken, Turkey and Sausage.

3.2 Identifying Similar Items

The similarity degree values between items are supplied by a fuzzy ontology, which specifies the semantics of the database contents. This step navigates through the fuzzy ontology structure to identify semantic similarity between items. If the similarity degree between items is greater than or equal to the minsim parameter explained in section 2, a semantic similarity association is found and this association is considered similar enough. A fuzzy association of size 2 is made by these pair of items found and are expressed by fuzzy items, where the symbol ~ indicates the similarity relation between items, for example, Kaki~Tomato.

After that, this step verifies the presence of similarity cycles as proposed in [16]. These are fuzzy associations of size greater than 2 that only exists if the items are, in pairs, sufficiently similar. The similarity cycles involve only leaf nodes of the ontology. The minimum size of a cycle is 3, and the maximum is the number of sibling leaf nodes.

Fig. 1. Example of Fuzzy Ontology with fuzzy similarity degrees between items

Page 5: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

NARFO Algorithm: Mining Non-redundant and Generalized Association Rules 419

In the end of this step, all fuzzy associations and similarity cycles are found and can be used to generate the rules.

Fig. 2. Steps of NARFO algorithm

3.3 Generating Candidates

The generation of candidates in this algorithm is similar to Apriori algorithm. However, in NARFO algorithm, besides the items identified in the step described in section 3.1, fuzzy items, which represent fuzzy associations, are also added to generated candidates.

At the end of this step, we have all itemsets candidates of size k that are sent to the step described in section 3.4.

3.4 Calculating the Weight of Candidates

The weight of candidates is calculated based on the Fuzzy weight equation proposed in [16], if a candidate itemset is fuzzy. This equation was created due to the presence of fuzzy logic concepts.

The weight reflects the number of occurrences of an itemset in the database. In this step, the database is scanned, and each of its rows is confronted with the set of candidate itemsets, one after one. For each occurrence of a non-fuzzy candidate itemset in a row, its weight is incremented by 1. Otherwise, it is incremented by the fuzzy value, which is calculated based on the Fuzzy weight equation.

Finished this step, the itemsets candidates are ready to be used in next step.

3.5 Evaluating Candidates

This step is similar to the corresponding one in Apriori algorithm. The support of each itemset is evaluated. However, during the process, besides verifying if each candidate

Page 6: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

420 R.G. Miani et al.

itemset is frequent or not (has support greater than or equal minsup), the algorithm verifies the non-frequent itemsets, step which is not verified by neither Apriori nor XSSDM algorithms. If all descendents of an ancestor are non-frequents, but if the sum of the descendents’ support is greater than or equal minsup, the NARFO algorithm generalizes the descendents to its ancestor and add the generalized itemset to the set of frequent itemsets. For example, considering the fuzzy ontology in figure 1, if ((Kaki), (Apple), (Tomato)) are non-frequent itemsets, the algorithm sum its supports. If the result is greater than or equal minsup, the generalization of the itemsets is done, and the ancestor Fruit is added to the set of frequent itemsets. This represents a meaningful and significant knowledge that was not extracted before by Apriori and other algorithms, since they do not consider the generalization of infrequent itemsets. It is one of the main contributions of this work. The pseudo-algorithm of this technique is showed in table 2.

Table 2. Pseudo-algorithm for generalizing non-frequent itemsets

1 2 3 4 5 6 7 8 9 10

for each ancestor for each non-frequent itemset of size k if itemset belongs to ancestor sum support and increment counter end if end for if counter is equal to the number of ancestor’s child add ancestor to frequent itemset // generalization end if end for

In table 2, lines from 2 to 6 count the number of descendents of an ancestor and

sum their supports. Line 7 checks if all descendents of an ancestor are non-frequent itemsets, and add the ancestor (generalized itemset) to the set of frequent itemsets in line 8. This generalization can also be applied for itemsets with size greater than 1. For example, if ((Kaki, Turkey), (Apple, Turkey), (Tomato, Turkey)) are non-frequent itemsets, the generalization (Fruit, Turkey) is made since all descendents of Fruit composes the non-frequent itemsets and the other item, Turkey, belongs to all those non-frequent itemsets.

The support of an itemset is the weight divided by the number of transactions in the database. If the item set is fuzzy, the algorithm divides the fuzzy weight to the total of transactions. If a candidate itemset is frequent, its support is greater than or equal minsup.

In the end of this step, we have all frequents itemsets of size k, where k is the size of the itemset. The algorithm returns to the step explained in section 3.3 to generate the candidate itemsets of size k + 1. If k is equal to the number of domains, NARFO algorithm goes to step 3.6.

3.6 Generating Rules

In this step, the association rules are generated for each itemset of the set of frequent itemsets. All possibilities of antecedents and consequents are generated. The rules that have the confidence value greater than or equal minconf are considered strong.

Page 7: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

NARFO Algorithm: Mining Non-redundant and Generalized Association Rules 421

Remember that, the confidence of a rule is given by the support of the rule divided by the support of the antecedent.

Then, the algorithm verifies these rules in order to check if a fuzzy item can be generalized. A fuzzy item can be generalized if all descendents of an ancestor are contained in the fuzzy item. If the rule Tomato~Cabbage Turkey exists, then the rule Vegetable Turkey is generated, since Vegetable comprises Tomato and Cabbage as descendant concepts.

After that, all the rules generated are sent to the next step to verify other possible generalization and redundant rules.

3.7 Generalizing and Treating Redundancy

After all rules have been generated by the previous step, they receive generalizing and redundancy treatment. The corresponding pseudo-algorithm of these treatments is showed in table 3.

For all rules, the algorithm verifies if the antecedent / consequent of each rule can be generalized (lines 1-6 in table 3). For example, if the algorithm generated the rules Kaki Sausage and Apple~Tomato Sausage, then the algorithm does the generalization to Fruit Sausage because all descendants of Fruit occurred in rules (including fuzzy items) with the Sausage concept as consequent. The generalization process happens not only if all descendents of an ancestor are in a fuzzy item, but also if all descendents of an ancestor are in different rules that have the same corresponding antecedent or consequent. This can happen at the antecedent or consequent of a rule.

In the redundancy treatment, the algorithm deals with two redundancy issues. The first one is when a rule is sub-rule of another one (lines from 7 to 11 in table 3). A sub-rule r1 is a rule that has the same items in antecedent and consequent considering another rule r2, except that r1 has at least one item that is descendent of an item in r2, in the same side of the rule (antecedent or consequent). Then, r1 is eliminated. For example, the rule Kaki Sausage is a sub-rule of Fruit Sausage as Kaki is descendent of the ancestor Fruit. Then, the rule Kaki Sausage is eliminated.

Table 3. Pseudo-algorithm of the generalization and redundancy treatment

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

for each rule if rule can be generalized //antecedent or consequent generalize rule end if add rule to v // v is a vector of rules end for for each rule from v if rule is not a sub-rule add to v1 // v1 is auxiliary vector of rules end if end for for each rule from v1 if rule is not a fuzzy sub-rule add to v2 // v2 is auxiliary vector of rules end if end for

Page 8: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

422 R.G. Miani et al.

The other redundancy treatment is the fuzzy redundancy (lines 12-16 in table 3). This kind of redundancy occurs when a rule is fuzzy sub-rule of another rule that contain a fuzzy item. A fuzzy sub-rule r1 is a rule that has the same items, in antecedent and consequent of another rule r2, except that r1 has at least one item that is contained in a fuzzy item, in the same side, in r2. For example, if the algorithm generates the rules Apple Sausage and Apple~Tomato Sausage, the first one is pruned. Only Apple~Tomato Sausage is showed to the user. When this case happens, the resultant rule is showed to the user in this format: Apple~Tomato Sausage with sup: 0.3 and conf: 0.45 (Item ‘Apple’ has more relevance). The phrase between the parentheses is showed in order to highlight the item in the fuzzy item that has more relevance, once there was a redundant fuzzy sub-rule with this relevant item.

4 Experiments

In this section we show some experiments performed to validate the NARFO algorithm.

In these experiments, we have considered data from the Brazilian Demographic Census 2000, provided by IBGE (Brazilian Institute of Geography and Statistics). Two databases containing information about demographic characteristics of the Brazilian population were analyzed: IBGE1, which contains information about Years of study, Race or ethnicity and Sex; and IBGE2, containing relations between Race or ethnicity and Living (urban or rural).

After analyzing the data and domain of demographic characteristics, the fuzzy ontology of figure 3 was created. The ontology was modeled in OWL (Web Ontology Language) and Jena Framework [17] was used to allow navigation through ontology concepts and relations, making NARFO algorithm able to obtain similar items and corresponding similar degrees.

The tests were done with constant minconf and minsim, whose values are, respectively, 0.2 and 0.1. The minsup value varies from 0.05 to 0.5, increasing it by 0.05.

Two tests (Test A and Test B) were done and compared with the XSSDM algorithm, regarding IBGE1 database. Test A considers the generalization and the redundancy treatment described in section 3.7, and compare the number of association

Fig. 3. Fuzzy Ontology of Demographic Characteristics

Page 9: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

NARFO Algorithm: Mining Non-redundant and Generalized Association Rules 423

rules generated in NARFO and XSSDM algorithms. The aim of Test A, without the generalization of non-frequent itemsets, is to confirm that NARFO algorithm extracts fewer association rules in comparison to XSSDM. With minsup = 0.05, NARFO generates 46 rules against 91 of XSSDM (49.45% of reduction). With minsup = 0.25, 4 rules were generated by our algorithm against 6 of XSSDM, and both algorithms do not generate any rule with minsup = 0.3. The results are illustrated in Figure 4.

0

10

20

30

40

50

60

70

80

90

100

0.5 0.45 0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05

minsup

Num

ber

of r

ules

XSSDM

NARFO reduction:49,45%

Fig. 4. NARFO only with generalization and redundancy treatment (Test A)

Test B compares the number of association rules extracted by NARFO considering all the techniques described in this paper, including the generalization of non-frequent itemsets described in section 3.5 and XSSDM. The results of Test B are illustrated by Figures 5 and 6. Figure 5 shows that NARFO algorithm generates fewer rules than XSSDM algorithm when considering low support values, because NARFO is able to eliminate redundant rules.

0

10

20

30

40

50

60

70

80

90

100

0.15 0.1 0.05

minsup

Nu

mb

er o

f ru

les

XSSDM

NARFO

Fig. 5. Full NARFO with low minsup values (Test B)

In Figure 6, NARFO generates more rules than XSSDM when the minsup is between 0.2 and 0.5. This happens because NARFO algorithm generalizes descendents of an ancestor if all of them are non-frequent, resulting in relevant rules that represent new and meaningful knowledge that was not found by the XSSDM algorithm, representing an important contribution of this work.

Page 10: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

424 R.G. Miani et al.

0

2

4

6

8

10

12

14

16

18

20

0.5 0.45 0.4 0.35 0.3 0.25 0.2

minsup

Nu

mb

er o

f ru

les

XSSDM

NARFO

Fig. 6. Full NARFO with high minsup values (Test B)

We have also tested both algorithms with IBGE2 database regarding information on Race or ethnicity and Living (urban or rural). The results are shown in Figure 7. For IBGE2 database, NARFO reduces the number of generated rules in 63.63% (4 rules against 11 of XSSDM), when support is below 0.3. Therefore, the generalization and redundancy treatment performed by NARFO is able to prune irrelevant rules that are generally obtained when considering low minsup values.

0

2

4

6

8

10

12

0,5 0,45 0,4 0,35 0,3 0,25 0,2 0,15 0,1 0,05

minsup

Nu

mb

er o

f ru

les NARFO

XSSDM reduction:63,63%

Fig. 7. Full NARFO tested for IBGE2 database

To sum up, we could observe two distinct situations from the results. When considering low minsup values, NARFO reduces the number of rules because it performs generalization and removes redundant association rules without lost in semantics, comparing to the XSSDM algorithm. On the other hand, when minsup is increased to high values, NARFO algorithm produces the same amount or more meaningful rules, due to the generalization of non-frequent itemsets. In this case, the additional rules express relevant knowledge, since they are based on the semantic concepts and relationships of the fuzzy ontology.

5 Conclusions and Future Work

This paper proposes a new algorithm, called NARFO algorithm, for mining non-redundant and generalized association rules based on fuzzy ontologies. Our algorithm

Page 11: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

NARFO Algorithm: Mining Non-redundant and Generalized Association Rules 425

does an efficient generalization and redundancy treatment without lost of information. Experiments have demonstrated that irrelevant rules are pruned, resulting in a smaller amount of redundant rules.

Furthermore, during the evaluation of candidates, NARFO performs generalization when all descendents of an ancestor are non-frequent itemsets, which provides meaningful knowledge, as stated by the tests. Hence, it is possible to obtain more relevant rules based on semantic information of fuzzy ontologies, therefore enhancing the knowledge discovery process.

We are implementing some improvements of the NARFO algorithm. For example, we intend to include a new parameter for generalization. Considering this, if X% of descendents are included in rules, the generalization is done. The algorithm will also show the descendents that do not generate a rule, in order to avoid showing mistaken information for users.

Acknowledgements. This work has been supported by the following Brazilian research agencies: CAPES, CNPq, FAPESP, FINEP and INEP. The first two authors also thank the support of the Web-PIDE Project in the context of the Observatory of the Education of the Brazilian Government.

References

1. Chen, G., Wei, Q., Kerre, E.E.: Fuzzy Data Mining: Discovery of Fuzzy Generalized Association Rules. In: Bordogna, G., Pasi, G. (eds.) Recent Issues on Fuzzy Databases, pp. 45–66. Physica-Verlag, Wurzburg (2000)

2. Farzanyar, Z., Kangavari, M., Hashemi, S.: A New Algorithm for Mining Fuzzy Association Rules in the Large Databases Based on Ontology. In: Sixth IEEE International Conference on Data Mining – Workshops, Hong Kong, China, December 18-22 (2006)

3. Srikant, R., Agrawal, R.: Mining Generalized Association Rules. In: Proceedings of the International Conference of Very Large Data Bases, Zurich, Suíça, September 11-15 (1995)

4. Hou, X., Gu, J., Shen, X., Yan, W.: Application of Data Mining in Fault Diagnosis Based on Ontology. In: 3th Third International Conference on Information Technology and Applications, Sydney, Australia, July 4-7 (2005)

5. Zadeh, L.: Similarity Relations and Fuzzy Orderings. In: Yager, R.R., Ovchinnikov, S., Tong, R.M., Nguyen, H.T. (eds.) Fuzzy Sets and Applications: Select Papers by L. A. Zadeh, pp. 81–104. Wiley Interscience, New York (1987a)

6. Zadeh, L.: Fuzzy Sets. In: Yager, R.R., Ovchinnikov, S., Tong, R.M., Nguyen, H.T. (eds.) Fuzzy sets and applications: Selected Papers by L.A. Zadeh, pp. 29–44. Wiley-Interscience, New York (1987b)

7. Chen, X., Zhou, X., Scherl, R.B., Geller, J.: Using an Interesting Ontology for Improved Support in Rule Mining. In: 5th InternationalConference on Data Warehousing and Knowledge Discovery, Prague, Czech Republic, September 3-5 (2003)

8. Escovar, E.L.G., Yaguinuma, C.A., Biajiz, M.: Using Fuzzy Ontologies to Extend Semantically Similar Data Mining. In: 21st Brazilian Symposium of Databases, Florianópolis, Brazil, October 16-20 (2006)

9. Brisson, L., Collard, M., Pasquier, N.: Improving Knowledge Discovery Process Using Ontologies. In: International Workshop on Mining Complex Data, Houston, USA, November 27-30 (2005)

Page 12: NARFO Algorithm: Mining Non-redundant and Generalized Association Rules Based on Fuzzy Ontologies

426 R.G. Miani et al.

10. Han, J., Fu, Y.: Mining Multiple-Level Association Rules in Large Databases. IEEE Transactions on Knowledge and Data Engeneering 11(5), 798–805 (1999)

11. Sriphaew, K., Theeramunkong, T.: Fast algorithms for mining generalized frequent patterns of generalized association rules. IEICE Transactions on Information and Systems E87-D(3), 761–770 (2004)

12. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering frequent closed itemsets for association rules. In: Proceedings of the International Conference on Database Theory, Jerusalém, Israel, January 10-12 (1999)

13. Kunkle, D., Zhang, D.H., Cooperman, G.: Mining Generalized Frequent Itemsets and Generalized Association Rules Without Redundancy. Journal of Computer Science and Technology 23(1), 77–102 (2008)

14. Bayardo, J.R.J.: Efficiently mining long patterns from databases. In: Proceedings of the 1998 Annual Conference on Management of Data, Seattle, USA, June 2-4 (1998)

15. Oliveira, V.C., Rezende, S.O., Castro, M.: Evaluating Generalized Association Rules Through Objective Measures. In: Proceedings of 25th International Multi-Conference on Artificial Intelligence and Applications, Innsbruck, Austria, February 12-14 (2007)

16. Escovar, E.L.G., Biajiz, M., Vieira, M.T.P.: SSDM: A Semantically Similar Data Mining Algorithm. In: 20th Brazilian Symposium of Databases, Uberlândia, Brazil, October 3-7 (2005)

17. Carrol, J.J., Dickinson, I., Dollin, C., Reynolds, D., Seaborn, A., Wilkinson, K.: Jena: implementing the semantic web recommendations. In: International World Wide Web Conference, New York, USA, May 19-21 (2004)