Provenance Management for SPARQL Updates Argyro Avgoustaki 1, 2, Giorgos Flouris 2, Irini Fundulaki 2 {argiro, fgeo, fundul}@ics.forth.gr 2 Institute 1 Computer Science Department, University of Crete of Computer Science, FORTH http://www.csd.uoc.gr http://www.ics.forth.gr Motivation Model Features suitable for encoding the triple and attribute level provenance of RDF quadruples uses complex algebraic expressions. represents the provenance of a quadruple as a set of provenance expressions indicating that the quadruple can be produced by different updates based on how and where provenance models supports unions of basic graph patterns In the last few years we have witnessed an explosion in the publication of data in the form of Linked Data. Recording the provenance information of Linked Data is an essential task in order to effectively support trustworthiness, accountability and repeatability. In this context, our work: Introduces a new provenance model for SPARQL INSERT Updates Allows the reconstructability of SPARQL INSERT Updates from their provenance Provides algorithmic support via the Provenance Construction and the Update Reconstruction algorithms Provenance Model quadruple identifier Quadruple (q) Provenance (P) Case s p o n c1 c2 a d b b d a n1 n2 P1 = { (⊥, ⊥, ⊥) } P2 = { ( qp1 .o (c1), qp1 .p (c1), qp1 .s (c1)) } 1 1 1 Attributes derived from const values Attributes derived by ``copying ‘’ values c3 a d b n3 P3 = { ( qp1 .o (c1 {qp 1 .p, qp 1 .o} ⊙ {qp 1 .p, qp1 .o} c2), 1 2 1 2 2 qp11 .o (c1 {qp 1 .p, qp 1 .o} ⊙ {qp 1 .p, qp1 .o} c2), 1 2 1 2 qp 11 .p (c1 {qp 1 .p, qp 1 .o} ⊙ {qp 1 .p, qp1 .o} c2) ) } 1 2 1 2 Attributes derived by ``copying’’ values generated via joins c4 a d b n4 P4 = { ( ( qp1 .s (c3), qp1 .p (c3), qp 1 .o (c3)) ⊕ (( qp 2 .o (c2), qp 2 .s (c2), qp 2 .p (c2)) ) } 1 1 1 1 1 1 Provenance Construction Attributes derived by ``copying’’ values from two different sources Update Reconstruction Provenance Compatible Updates • Quadruple q3: (a, d, b, n3) position INSERT s p o { (?o1, ?o, ?p, WHERE { (?s, (?o, quad pattern identifier ?p, ?o, n3) } n1) . ?p, ?o1, n2) } qi is assigned to identifier ci and has provenance Pi *c1 and c2 result from the evaluation 1 1 of qp1 and qp2 respectively qpins 1 qp 1 qp 21 Dataset --------------{q1, q2} position q3: (a, d, b, n3) INSERT P3: { ( qp 1 (c1* qp 1 qp 1 ⊙ qp 1 qp 1 c2*), { 2 .p , 2 .s } { 2 .o 1 .o } 1 .p , 1 qp 1 .o (c1 { qp 1 .p , qp 11 .o } ⊙ { qp 1 .p , qp21 .s } c2 ), 1 2 1 qp 1 .p (c1 { qp 1 .p , qp 11 .o } ⊙ { qp 1 .p , qp21 .s } c2) ) } 1 Provenance Construction Algorithm O(n) w.r.t the update size O(logn) w.r..t. the Dataset size O(n) w.r.t. the result size Future Work 1. Include FILTER and other non-monotonic operators 2. Study the SPARQL DELETE, CREATE and DROP operations, since all SPARQL operations can be written as a combination of INSERT, DELETE, CREATE and DROP statements 3. Implement our model to experimentally evaluate its performance and scalability for large SPARQL updates and/or updates with a large output. This work was partially supported by the PlanetData NoE (FP7:ICT-2009.3.4, #257641) and DIACHRON IP (FP7:ICT-2011.4.3, #601043) WHERE s { (?x, ?y, { (?w, ?z, (?y, 2 p ?z, o quad pattern identifier ?z, n3) } qpins ?y, n1) . 1 qp 1 ?x, n2) } 1 qp 2 Update Reconstruction Algorithm O(n) w.r.t the update size O(n) w.r..t. the Dataset size References 1. Theoharis, Y., Fundulaki, I., Karvounarakis, G., Christophides, V.: On Provenance of Queries on Semantic Web Data. IEE 15(1), 31-39 (2011) 2. Buneman, P., Cheney, J., Vansummeren, S.: On the Expressiveness of Implicit Provenance in Query and Update Languages. In: Schwentick, T., Suciu, D. (eds.) ICDT. vol. 4353, pp. 209-223. Springer (2007) 3. Green, T.J., Karvounarakis, G., Tannen, V.: Provenance semirings. In: PODS. pp. 31-40. ACM (2007)
© Copyright 2024 ExpyDoc