FEASIBLE: A Feature-Based SPARQL Benchmark Generation Framework
Muhammad Saleem¹, Qaiser Mehmood², Axel-Cyrille Ngonga Ngomo¹
http://feasible.aksw.org/
¹ Agile Knowledge Engineering and Semantic Web (AKSW), University of Leipzig, Germany
² Insight Center for Data Analytics, National University of Ireland, Galway
International Semantic Web Conference, Bethlehem, USA, 2015

Triple Stores Benchmarks
• Synthetic benchmarks
  • Make use of synthetic queries and/or data
  • Benchmarks of different data sizes are possible
  • Suitable for testing scalability
  • Often fail to reflect reality
  • Examples: LUBM, SP2Bench, BSBM, WatDiv
• Query log benchmarks
  • Make use of real queries from query logs
  • Can be closer to reality
  • Can be used with different data sizes
  • Scalability can be tested
  • Examples: DBPSB, FEASIBLE

DBpedia SPARQL Benchmark
• Based on real DBpedia query logs
• Benchmarks of different data sizes are possible
• Suitable for testing scalability
• Only considers SPARQL SELECT
• Does not consider important query features
  • For example, number of join vertices, triple pattern selectivities
• Not customizable for given use cases or needs of an application

FEASIBLE SPARQL Benchmark
• Can be applied to any SPARQL query log
• Considers SPARQL SELECT, ASK, DESCRIBE, CONSTRUCT
• Considers important query features
  • For example, number of join vertices, triple pattern selectivities, query runtime, result set size, number of BGPs, mean join vertex degree, number of triple patterns etc.
• Customizable for given use cases or needs of an application

FEASIBLE SPARQL Benchmark
• Dataset cleaning
• Feature vectors and normalization
• Selection of exemplars
• Selection of benchmark queries
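
The middle two steps can be illustrated with a small sketch. It assumes min-max scaling for the normalization and a generic greedy farthest-first heuristic for picking exemplars; the slides do not spell out either choice, so both are assumptions made for illustration, and the function names `normalize` and `pick_exemplars` are hypothetical.

```python
from math import dist


def normalize(vectors):
    """Min-max scale every feature column to [0, 1] (assumed scheme)."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [
        tuple((v - lo) / span if span else 0.0
              for v, lo, span in zip(row, lows, spans))
        for row in vectors
    ]


def pick_exemplars(points, k):
    """Return indices of k points chosen greedily (farthest-first).

    Assumption: start from the point closest to the centroid, then keep
    adding the point whose nearest already-chosen point is farthest away.
    """
    dims = len(points[0])
    centroid = tuple(sum(p[d] for p in points) / len(points) for d in range(dims))
    chosen = [min(range(len(points)), key=lambda i: dist(points[i], centroid))]
    while len(chosen) < min(k, len(points)):
        candidates = (i for i in range(len(points)) if i not in chosen)
        chosen.append(max(
            candidates,
            key=lambda i: min(dist(points[i], points[j]) for j in chosen),
        ))
    return chosen


# Toy feature vectors: (number of triple patterns, result set size).
features = [(1, 10), (2, 20), (9, 90), (10, 100), (5, 50)]
exemplars = pick_exemplars(normalize(features), 3)
```

Normalizing first keeps a feature with a large numeric range, such as result set size, from dominating the distance computation.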

Dataset Cleaning
• Remove syntactically incorrect queries
• Remove queries with zero results
• It is an optional step
  • Not of theoretical necessity
  • Leads to practically reliable benchmarks
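
A minimal sketch of this cleaning step follows. The syntax check and the result-size lookup are passed in as callables because the slides do not say which parser or endpoint is used; `clean_log`, `is_well_formed`, and `result_size` are hypothetical names, and the toy stand-ins below are not a real SPARQL parser.

```python
def clean_log(queries, is_well_formed, result_size):
    """Keep only queries that parse and return at least one result."""
    cleaned = []
    for query in queries:
        if not is_well_formed(query):
            continue  # drop syntactically incorrect queries
        if result_size(query) == 0:
            continue  # drop queries with zero results
        cleaned.append(query)
    return cleaned


# Toy stand-ins (assumptions): a keyword check instead of a parser,
# and a lookup table instead of running queries against an endpoint.
QUERY_FORMS = {"SELECT", "ASK", "DESCRIBE", "CONSTRUCT"}


def toy_is_well_formed(query):
    words = query.split()
    return bool(words) and words[0].upper() in QUERY_FORMS


log = [
    "SELECT * WHERE { ?s ?p ?o }",
    "SELEC broken query",
    "DESCRIBE <http://example.org/nothing>",
]
toy_sizes = {log[0]: 3, log[2]: 0}
kept = clean_log(log, toy_is_well_formed, lambda q: toy_sizes[q])
```

The syntax check runs first, so a malformed query is never sent on to the result-size lookup.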

• Sesame Version 2.7.8
  • Tomcat 7 as HTTP interface and native storage layout
  • Set the spoc, posc, opsc indices to those specified in the native storage configuration
  • The Java heap size was set to 6GB
• Jena-TDB (Fuseki) Version 2.0
  • Java heap size set to 6GB
• OWLIM-SE Version 6.1
  • Tomcat 7.0 as HTTP interface
  • Set the entity index size to 45,000,000 and enabled the predicate list
  • Rule set was empty and the Java heap size was set to 6GB
• We configured all triple stores to use 6GB of memory and used default values otherwise.

Comparison of Composite Error
FEASIBLE's composite error is 54.9% lower than that of DBPSB

Comparison of Triple Stores: QpS
[Figure: four bar charts (SPARQL ASK, SPARQL CONSTRUCT, SPARQL DESCRIBE, SPARQL SELECT), each showing QpS for Sesame, Virtuoso, OWLIM-SE, and Fuseki on the SWDF and DBpedia datasets.]

Comparison of Triple Stores: Mix Queries
[Figure: four bar charts comparing Sesame, Virtuoso, OWLIM-SE, and Fuseki on query mixes: QMpH on SWDF, QMpH on DBpedia, QpS on SWDF, and QpS on DBpedia.]
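
The two metrics in these comparisons are commonly read as queries per second (QpS) and query mixes per hour (QMpH). The sketch below assumes those usual definitions, which the slides themselves do not spell out; the runtimes are made-up toy values, not measurements from the evaluation.

```python
def queries_per_second(runtimes_s):
    """QpS (assumed definition): queries executed per second of total runtime."""
    return len(runtimes_s) / sum(runtimes_s)


def query_mixes_per_hour(mix_runtimes_s):
    """QMpH (assumed definition): complete runs of the whole mix per hour."""
    return 3600.0 / sum(mix_runtimes_s)


# Toy per-query runtimes in seconds for one run of a three-query mix.
toy_runtimes = [0.5, 1.5, 2.0]
qps = queries_per_second(toy_runtimes)
qmph = query_mixes_per_hour(toy_runtimes)
```

Under these definitions both metrics fall as the total runtime of the mix grows, so a single slow query pulls down the score of an otherwise fast store.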

Rank-wise Ranking of Triple Stores
All values are in percentages
• No system is the sole winner or loser for a particular rank
• Virtuoso mostly lies in the higher ranks, i.e., ranks 1 and 2 (68.29%)
• Fuseki is mostly in the middle ranks, i.e., ranks 2 and 3 (65.14%)
• OWLIM-SE is usually on the slower side, i.e., ranks 3 and 4 (60.86%)
• Sesame is either fast or slow: rank 1 (31.71% of the queries) and rank 4 (23.14%)
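
One way such rank percentages could be derived from per-query runtimes is sketched below. It assumes that rank 1 goes to the fastest store on each query and that ties are broken by the order in which stores are listed; the store names and runtimes are toy placeholders, not the measured results.

```python
from collections import Counter


def rank_shares(runtimes):
    """Map each store to {rank: percentage of queries on which it got that rank}.

    `runtimes` maps a store name to its per-query runtimes; all lists must
    have the same length. Rank 1 is the fastest store on a query (assumed).
    """
    stores = list(runtimes)
    n_queries = len(runtimes[stores[0]])
    counts = {store: Counter() for store in stores}
    for q in range(n_queries):
        # sorted() is stable, so ties keep the order of `stores`.
        ordered = sorted(stores, key=lambda s: runtimes[s][q])
        for rank, store in enumerate(ordered, start=1):
            counts[store][rank] += 1
    return {
        store: {rank: 100.0 * counts[store][rank] / n_queries
                for rank in range(1, len(stores) + 1)}
        for store in stores
    }


# Toy runtimes in seconds for four queries on four hypothetical stores.
toy = {
    "store_a": [1, 1, 4, 4],
    "store_b": [2, 2, 2, 2],
    "store_c": [3, 3, 3, 3],
    "store_d": [4, 4, 1, 1],
}
shares = rank_shares(toy)
```

In this toy data `store_a` is "either fast or slow", the same pattern the slide reports for Sesame: half of its queries land on rank 1 and half on rank 4.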