ConQuer: Efficient Management of Inconsistent Databases. Presented by: Ariel Fuxman (Univ. of Toronto). Joint work with: Renée J. Miller (Univ. of Toronto), Diego Fuxman (Univ. Nacional del Sur)
ConQuer: Efficient Management of Inconsistent Databases
Presented by:
Ariel Fuxman (Univ. of Toronto)
Joint work with:
Renée J. Miller (Univ. of Toronto)
Diego Fuxman (Univ. Nacional del Sur)
Ariel Fuxman, Diego Fuxman, Renée J. Miller
ConQuer
A system designed to answer SQL queries over inconsistent databases

INCONSISTENT DATABASE
Name    Income
Peter   40K
Peter   200K
Paul    400K
Mary    110K
Mary    130K

name should be the key
Note (db2admin, 5/2/2005): As a motivation, let's focus on a domain known in IT as CRM (Customer Relationship Management). One of the goals of CRM is to integrate customer information from such disparate sources as ..... This domain is of interest to us because customer data is notoriously dirty and inconsistent.
Disagreement Between Sources
Which tuple for Peter should we delete?
• Removing both tuples loses consistent information
• Deciding the correct income may require human intervention
Consistent Query Answers

Repairs
Repair 1: Peter 40K, Paul 400K, Mary 110K
Repair 2: Peter 40K, Paul 400K, Mary 130K
Repair 3: Peter 200K, Paul 400K, Mary 110K
Repair 4: Peter 200K, Paul 400K, Mary 130K

q = “Get customers who make more than 100K”
q(Repair 1) = {Paul, Mary}
q(Repair 2) = {Paul, Mary}
q(Repair 3) = {Peter, Paul, Mary}
q(Repair 4) = {Peter, Paul, Mary}

CONSISTENT ANSWER = {Paul, Mary}, no matter which repair we choose
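The consistent answer for this example can be checked by brute force. The sketch below is only an illustration and not part of ConQuer: the names `customer`, `repairs`, and `q` are assumptions, and a repair is taken to be a choice of exactly one tuple per key value (name).

```python
from itertools import product

# The inconsistent relation from the slides: (name, income in thousands).
customer = [
    ("Peter", 40), ("Peter", 200),
    ("Paul", 400),
    ("Mary", 110), ("Mary", 130),
]

def repairs(rows):
    """Yield every repair: exactly one tuple kept per key (name) value."""
    groups = {}
    for name, income in rows:
        groups.setdefault(name, []).append((name, income))
    for choice in product(*groups.values()):
        yield list(choice)

def q(rows):
    """Get customers who make more than 100K."""
    return {name for name, income in rows if income > 100}

all_repairs = list(repairs(customer))
# An answer is consistent if q returns it on every repair.
consistent_answer = set.intersection(*(q(r) for r in all_repairs))
print(len(all_repairs), sorted(consistent_answer))
```

Peter drops out because the repairs that keep his 40K tuple do not return him, while Paul and Mary qualify in all four repairs.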
Problem
Potentially HUGE number of repairs!
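To see why the number of repairs grows so quickly, note that under a key constraint each key value contributes an independent choice, so the count is the product of the group sizes. The helper `count_repairs` and the 50-customer table below are hypothetical, used only to illustrate the growth.

```python
from collections import Counter
from math import prod

def count_repairs(keys):
    """Number of repairs = product of the number of tuples sharing each key."""
    return prod(Counter(keys).values())

# Slide example: Peter appears twice, Paul once, Mary twice.
slide_count = count_repairs(["Peter", "Peter", "Paul", "Mary", "Mary"])

# Hypothetical table: 50 customers, each with two conflicting tuples.
hypothetical_keys = [f"c{i}" for i in range(50) for _ in range(2)]
big_count = count_repairs(hypothetical_keys)

print(slide_count, big_count)
```

With only two conflicting tuples per customer, the count doubles with every additional customer, which is why enumerating repairs does not scale.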
ConQuer
ConQuer is a system designed to compute consistent answers efficiently
• avoids explicit construction of repairs
These restrictions are not arbitrary since it is known that there are some SPJ queries for which there is no SQL rewriting
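For a query that does admit one, an SQL rewriting can be run directly on the inconsistent database. The sketch below is a hand-written rewriting for the income example, assuming a table `customer(name, income)` with key `name`; it is not the output of ConQuer, and it uses SQLite only for convenience. A name is returned only if it has a qualifying tuple and no tuple with the same key that fails the condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, income INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Peter", 40000), ("Peter", 200000), ("Paul", 400000),
     ("Mary", 110000), ("Mary", 130000)],
)

# Original query: SELECT name FROM customer WHERE income > 100000
# Assumed rewriting: also require that no tuple with the same key
# violates the selection condition.
REWRITTEN = """
SELECT DISTINCT c.name
FROM customer c
WHERE c.income > 100000
  AND NOT EXISTS (
        SELECT 1 FROM customer c2
        WHERE c2.name = c.name AND c2.income <= 100000)
ORDER BY c.name
"""

consistent_names = [row[0] for row in conn.execute(REWRITTEN)]
print(consistent_names)
```

The query touches the table once with a correlated subquery and never materializes a repair.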
Demo
Present a case study of an inconsistent database about airports and cities
Explain the automatically generated rewritings
Deal with Select-Project-Join queries with grouping and aggregation
ConQuer papers
A. Fuxman, E. Fazli, and R. J. Miller. ConQuer: Efficient Management of Inconsistent Databases, SIGMOD 2005.
A. Fuxman and R. J. Miller. First-Order Query Rewriting for Inconsistent Databases, ICDT 2005.