Design of Declarative Graph Query Languages: On the Choice between Value, Pattern and Object based Representations for Graphs
Hasan Jamil
Department of Computer Science
Wayne State University
IEEE ICDE Workshop on Graph Data Management
Washington DC
April 5, 2012
Biological Networks
Outline
• Algorithm vs. Query Language
  ◦ GraphQL (He and Singh, 2008)
  ◦ SAGA (Tian et al., 2007), TALE (Tian and Patel, 2008), GADDI (Zhang, Li and Yang, 2009), NOVA (Zhu et al., 2010), etc.
• NetQL
  ◦ Subgraph isomorphism (ICTAI 2009, SAC 2011)
    • IsoKEGG (BIBM 2010)
  ◦ Graph reachability (CIKM 2010)
  ◦ Top-k similar graphs (TCBB, in press)
  ◦ Network extraction (SAC 2010)
Many incarnations of graphs
Subgraph isomorph of subgraph
• The query: find the set of graphs G such that each g ∈ G has a subgraph isomorph in another set of graphs G', where a query graph q is a subgraph isomorph of each g' ∈ G'.
  ◦ The answer is g4.
• This cannot be computed in GraphQL (He and Singh, SIGMOD 2008) or GraphGrep (Giugno and Shasha, ICPR 2002), for example.
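The nested query above can be sketched in plain Python to show why its variables must range over whole graphs. This is only an illustrative brute-force sketch, not the NetQL or NyQL implementation: the helper names (`graph`, `sub_iso`, `nested_query`), the toy data, and the reading of the query (first select G' as the graphs that contain q, then keep every g that contains some g' from G') are assumptions. Subgraph isomorphism is tested here as an injective, edge-preserving vertex mapping, by exhaustive search.

```python
from itertools import permutations

def graph(*edges):
    """Build an undirected graph as (vertex set, set of frozenset edges)."""
    es = {frozenset(e) for e in edges}
    vs = set().union(*es)
    return (vs, es)

def sub_iso(small, big):
    """True if `small` maps injectively into `big` with every edge preserved."""
    sv, se = small
    bv, be = big
    sv = sorted(sv)
    # Try every injective assignment of small's vertices to big's vertices.
    for image in permutations(sorted(bv), len(sv)):
        m = dict(zip(sv, image))
        if all(frozenset(m[u] for u in e) in be for e in se):
            return True
    return False

def nested_query(q, candidates, data):
    """Handles of graphs in `data` that contain some candidate that contains q."""
    g_prime = [g for g in candidates.values() if sub_iso(q, g)]
    return sorted(h for h, g in data.items()
                  if any(sub_iso(gp, g) for gp in g_prime))

# Hypothetical toy data: q is a two-edge path.
q = graph(("x", "y"), ("y", "z"))
candidates = {
    "c1": graph(("a", "b"), ("b", "c"), ("a", "c")),  # triangle, contains q
    "c2": graph(("a", "b")),                          # single edge, too small
}
data = {
    "g1": graph(("a", "b"), ("b", "c")),                          # no triangle
    "g2": graph(("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")),  # triangle plus tail
}
result = nested_query(q, candidates, data)
```

Both the inner and the outer comparison take entire graphs as operands, which is the higher-order flavour that a pattern-enumeration language does not express directly.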
Research issues
• Computing this query requires a higher order query language
  ◦ Variables need to range over sets of tuples or structures
  ◦ Completeness is at risk
  ◦ Higher processing cost is also expected
• Compromise?
  ◦ Develop operators such as
    • SQL aggregates
    • Data cube
    • Skyline
    • Association rule mining
NyQL Solution
Main Issues
• Representation that helps
  ◦ Compare arbitrary graphs as single, perhaps complex, objects
• Values
  ◦ No unified view of a graph
• Patterns
  ◦ Need enumeration (GraphQL); query limitations
• Objects
  ◦ In the form of a special tuple
  ◦ Allows access to a whole graph through a handle
  ◦ Allows comparing whole graphs without pattern enumeration
• But
  ◦ It's higher order
  ◦ Model graph comparisons as operators
Technicality
• Represent each graph as a pair <I,<V,E>>, where I is a graph ID or handle, V is the set of vertices in graph I, and E is the set of edges.
  ◦ Extension for labeled graphs
    • <I,<<V,Lv>, <E,Le>>>
  ◦ Extension for directed graphs
    • Enforce symmetry of E (undirected) vs. no symmetry (directed)
• Define graph operators that apply to this structure and are undefined otherwise
• Consequence?
  ◦ Any relation can be restructured to represent graphs
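The <I,<V,E>> representation and the restructuring of a relation into graph objects can be sketched as follows. This is a minimal sketch under assumptions: the `GraphObject` class, the `graphs_from_relation` helper, and the three-column (graph ID, from, to) relation are hypothetical names chosen for illustration, not the talk's syntax.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GraphObject:
    """One graph as the pair <I, <V, E>>."""
    handle: str           # I: graph ID or handle
    vertices: frozenset   # V: vertices of graph I
    edges: frozenset      # E: (from, to) pairs

def graphs_from_relation(rows):
    """Restructure a flat (graph_id, from, to) relation into graph objects."""
    grouped = {}
    for gid, src, dst in rows:
        v, e = grouped.setdefault(gid, (set(), set()))
        v.update((src, dst))
        e.add((src, dst))
    return {gid: GraphObject(gid, frozenset(v), frozenset(e))
            for gid, (v, e) in grouped.items()}

# Hypothetical edge relation holding two graphs.
rows = [("g1", "a", "b"), ("g1", "b", "c"), ("g2", "x", "y")]
graphs = graphs_from_relation(rows)
```

Once grouped this way, a whole graph is reachable through its handle (`graphs["g1"]`), so an operator can take it as a single operand instead of enumerating its edge tuples.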
Dependencies
Relation pairs: nodes and edges
X ⊆ nodes(R), IN ⋲ R, IN → X, N discriminator of INX
Y ⊆ edges(S), IFT ⋲ R, IFT → Y, F and T foreign key (N in R), FT discriminator of IFTY
Undirected: enforce symmetry of F and T
Labeled: use X and Y
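The constraints above can be checked mechanically on a pair of relations. The sketch below is an assumption-laden reading of the slide: rows are plain dictionaries, the node relation has attributes I, N, X and the edge relation I, F, T, Y, and the function names are hypothetical. It checks the functional dependencies IN → X and IFT → Y, the foreign-key reference of F and T to N, and the symmetry of F and T required for undirected graphs.

```python
def holds_fd(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in `rows`."""
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in lhs)
        val = tuple(r[a] for a in rhs)
        # The first value seen for a key is stored; a later different one violates the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True

def endpoints_reference_nodes(nodes, edges):
    """True if every F and T refers to a node N of the same graph I."""
    known = {(r["I"], r["N"]) for r in nodes}
    return all((e["I"], e["F"]) in known and (e["I"], e["T"]) in known
               for e in edges)

def is_symmetric(edges):
    """True if every edge (F, T) of a graph I also appears as (T, F)."""
    pairs = {(e["I"], e["F"], e["T"]) for e in edges}
    return all((i, t, f) in pairs for i, f, t in pairs)

# Hypothetical labeled, undirected graph g1 stored in two relations.
nodes = [
    {"I": "g1", "N": "a", "X": "kinase"},
    {"I": "g1", "N": "b", "X": "ligand"},
]
edges = [
    {"I": "g1", "F": "a", "T": "b", "Y": "binds"},
    {"I": "g1", "F": "b", "T": "a", "Y": "binds"},
]
valid = (holds_fd(nodes, ["I", "N"], ["X"])
         and holds_fd(edges, ["I", "F", "T"], ["Y"])
         and endpoints_reference_nodes(nodes, edges)
         and is_symmetric(edges))
```

Dropping the symmetry check gives the directed case; dropping X and Y gives the unlabeled case.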
Syntax
Creating graphs
Querying
Operations allowed in perform
• match, isomorph, subisomorph, similar, k-similar and circuit
• using library clause in BioFlow/Curray to support analysis tools
• Question?
  ◦ How to implement these operations?
  ◦ Query optimization?
  ◦ Selection, projection, join?
    • Selection conditions, partial constraints à la IsoKEGG (BIBM 2010)
isomorph, match
• For computing isomorph, check whether the query graph and the data graph have identical "type" of descriptors; restrict unification to the full structure
• For match, do not apply term replacement/mapping
  ◦ Question: how do we allow partial unification to support partial isomorphism?
    • Separate into two groups, with no term replacement in one set (BIBM 2010)
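The difference between the two operators can be illustrated with a small sketch. It assumes directed graphs given as (vertex set, edge set) pairs and uses an exhaustive search over vertex renamings; the function bodies are illustrative stand-ins, not the unification procedure of the talk, and the size check at the top of `isomorph` is an added shortcut rather than the "type of descriptors" test itself.

```python
from itertools import permutations

def match(q, g):
    """No term replacement: vertices and edges must coincide as written."""
    return q == g

def isomorph(q, g):
    """Term replacement allowed: some renaming of q's vertices yields g."""
    qv, qe = q
    gv, ge = g
    if len(qv) != len(gv) or len(qe) != len(ge):
        return False
    qv = sorted(qv)
    # Try every bijection from q's vertices onto g's vertices.
    for image in permutations(sorted(gv)):
        m = dict(zip(qv, image))
        if {(m[a], m[b]) for a, b in qe} == set(ge):
            return True
    return False

# Hypothetical graphs: two directed paths and one out-star.
q = ({"a", "b", "c"}, {("a", "b"), ("b", "c")})
g = ({"x", "y", "z"}, {("x", "y"), ("y", "z")})
h = ({"x", "y", "z"}, {("x", "y"), ("x", "z")})

same_shape = isomorph(q, g)   # renaming a->x, b->y, c->z works
same_terms = match(q, g)      # vertex names differ
```

A partial variant along the lines of the last bullet would fix the mapping for one group of vertices (no replacement) and search renamings only for the rest.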
Basis: Minimum Hub Cover
MHC of G
Computational model
Similar to deep equality (Abiteboul and Van den Bussche, DOOD 1995).
Search
Cliques of order 3 or less of n