Integration of Spatial Information Sources Based on Source Description Framework
Yoshiharu Ishikawa, Gihyong Ryu, and Hiroyuki Kitagawa
University of Tsukuba

Background
- Spatial information sources: emerging new information sources on the Internet
  - information sources that provide region- or location-oriented information
  - they support mobile users with GPS receivers and hand-held devices
- Need for technology to integrate spatial information sources
  - description of spatial information sources that takes their contents into consideration
  - efficient and effective query planning and processing
- Popular approach for information integration: the well-known wrapper-mediator approach
  - Wrapper
    - encapsulates the details of each information source
    - provides an abstract, uniform view of the source
  - Mediator
    - selects appropriate information sources for a given query
    - performs query planning and processing

Our Objective and Approaches
- Objective: development of a spatial information integration framework for location-aware information services
  - to provide useful location-oriented information services to mobile users
- Our approach (1): development of a description method to represent spatial information sources
  - based on the source description framework
  - describes the contents and the service of the source
- Our approach (2): development of query planning and processing methods that effectively utilize source descriptions
  - considers the heterogeneity of the underlying information sources
  - makes effective use of the query processing power of each information source

Motivating Example (1)
- Query received by the mediator: show the top-20 nearest restaurants such that
  - they are within 1000 meters of the current position
  - their evaluation score is greater than or equal to 2.5 stars
  [Figure: restaurants numbered 1 to 7 around the current position, with a 1000 m radius]
- Information Source A: provides restaurant info for a specific area
  - contains information on restaurants within the rectangular area rectA
  - given a name or address, it returns the matching restaurants

Motivating Example (2)
- Information Source B: supports spatial conditions to query restaurant info
  - returns restaurants within the specified circular area (results are ordered by distance)
  - accepts an additional condition on restaurant category, e.g. category = "Chinese"
- Information Source C: provides restaurant evaluation scores
  - given a restaurant name, it returns the evaluation score
  - example: select * from Source-C where name = "Manchu"

      name     score
      Manchu   3.0

Source Description Framework (1)
- Source Description Framework: a formal framework to specify meta information for an information source
  - proposed in Information Manifold [Levy et al 96]
- A source description consists of:
  - Contents Description: describes the contents of the source in terms of the global schema
  - Capability Description: describes the types of queries that the source can support
- Our approach (1): incorporates the notion of spatial data types into the source description framework
  - then represents spatial information and spatial queries
- Our approach (2): allows the specification of top-N query capability in capability descriptions

Source Description Framework (2)
- Data model
  - the global schema is written in the relational data model enhanced with spatial data types
  - the global schema specifies a virtual database: each information source is (partially) mapped into the schema

      relation Restaurant {
        name     string;
        category string;
        address  string;
        location point;
      }

      relation Evaluation {
        name  string;
        score real;
      }

- Query language: monoid comprehension [Fegaras&Maier95]
  - a declarative query language
  - an extension of list comprehension (used in functional programming) to multiple collection types (e.g., bag, set)
  - basic form: M{E | Q1, Q2, ..., Qn}
    - M: the collection type of the evaluation result
    - E: an allowable expression
    - Qi: a generator (of the form v ← V) or a filter

Examples of Source Description
- Source description for information source A:
  - it provides information on restaurants within the region rectA
  - it can receive a name or an address as a query condition

    Source A
      contents (contents description):
        SA ⊆ set{r | r ← Restaurant, in(r.location, rectA)}
      input/output (capability description):
        < > → SA
      filters:
        <n: string> name = n, <a: string> address = a

- Source description for information source B:
  - it receives the query point and the maximum allowable distance
  - it returns results ordered by distance
  - it can receive a category as an additional filtering condition

    Source B
      contents:
        SB ⊆ set{r | r ← Restaurant}
      input/output (result sorted by the distance value d):
        <q: point, m: real> → sorted[d]{x | x ← SB, d ≡ dist(x.location, q), d ≤ m}
      filters:
        <c: string> category = c

Query Processing (1)
- Example query: retrieve the name and address of restaurants such that
  - they are within 1000 meters of mypos (the current position of the user)
  - their evaluation scores are greater than or equal to 2.5 stars
  - they are among the nearest top-20

    Q = head[20](
          sorted[d]{<name: r.name, address: r.address> |
                    r ← Restaurant, s ← Evaluation, r.name = s.name,
                    s.score ≥ 2.5, d ≡ dist(r.location, mypos), d ≤ 1000})

- Step 1: an access target description query AQ is generated
  - AQ specifies the information required to process query Q

    AQ = set{r#s | r ← Restaurant, s ← Evaluation, r.name = s.name,
                   s.score ≥ 2.5, dist(r.location, mypos) ≤ 1000}

- Step 2: subqueries QR and QS are extracted

    QR = set{r | r ← Restaurant, dist(r.location, mypos) ≤ 1000}
    QS = set{s | s ← Evaluation, s.score ≥ 2.5}
    AQ = set{r#s | r ← QR, s ← QS, r.name = s.name}

Query Processing (2)
- Step 3: target information sources are determined for each subquery
  - for example, source A may contain information required for QR, and becomes a target information source, if SA ∩ QR ≠ ∅ (SA appeared in the source description for source A)
  - this condition is equivalent to: the condition in(r.location, rectA) ∧ (dist(r.location, mypos) ≤ 1000) is satisfiable
- Step 4: a query plan is generated for each combination of information sources, for example:

    PA,C = set{x#y | x ← IterC(score ≥ 2.5),
                     y ← IterA(name = x.name),
                     dist(y.location, mypos) ≤ 1000}

- Step 5: the final integration plan is generated based on the subqueries over the information sources
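
The Step 3 relevance test and the Step 4 plan for sources A and C can be illustrated with a small Python sketch. This is not the authors' implementation. It assumes planar coordinates in meters, and all names (`may_be_relevant`, `plan_a_c`, `iter_a`, `iter_c`) and the toy data are hypothetical. The satisfiability check clamps the query point to the rectangle to find the rectangle's nearest point, and the plan is a dependent join that probes source A once per name returned by source C.

```python
import math
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Point(NamedTuple):
    x: float  # planar coordinates in meters (assumption)
    y: float


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


class Restaurant(NamedTuple):
    name: str
    address: str
    location: Point


class Evaluation(NamedTuple):
    name: str
    score: float


def dist(p: Point, q: Point) -> float:
    """Euclidean distance between two points."""
    return math.hypot(p.x - q.x, p.y - q.y)


def may_be_relevant(rect: Rect, center: Point, radius: float) -> bool:
    """Step 3 test: is 'in(p, rect) and dist(p, center) <= radius' satisfiable?"""
    # Clamping the center to the rectangle yields the rectangle's closest point.
    nearest = Point(min(max(center.x, rect.xmin), rect.xmax),
                    min(max(center.y, rect.ymin), rect.ymax))
    return dist(nearest, center) <= radius


def plan_a_c(iter_c: Callable[[float], Iterable[Evaluation]],
             iter_a: Callable[[str], Iterable[Restaurant]],
             mypos: Point, max_dist: float, min_score: float,
             n: int) -> List[Tuple[str, str]]:
    """Dependent join in the spirit of P_{A,C}, then sort by distance and keep n."""
    hits = []
    for x in iter_c(min_score):        # x <- Iter_C(score >= min_score)
        for y in iter_a(x.name):       # y <- Iter_A(name = x.name)
            d = dist(y.location, mypos)
            if d <= max_dist:          # spatial filter applied by the mediator
                hits.append((d, y.name, y.address))
    hits.sort()                        # nearest first
    return [(name, address) for _, name, address in hits[:n]]


# Hypothetical toy data standing in for sources A and C.
RECT_A = Rect(0, 0, 2000, 2000)
SOURCE_A = [Restaurant("Manchu", "1-1 Example St", Point(300, 400)),
            Restaurant("Sakura", "2-2 Example St", Point(100, 0)),
            Restaurant("Far Away", "9-9 Example St", Point(1900, 1900))]
SOURCE_C = [Evaluation("Manchu", 3.0), Evaluation("Sakura", 2.0),
            Evaluation("Far Away", 4.5)]


def iter_a(name: str) -> List[Restaurant]:
    return [r for r in SOURCE_A if r.name == name]


def iter_c(min_score: float) -> List[Evaluation]:
    return [e for e in SOURCE_C if e.score >= min_score]


result = plan_a_c(iter_c, iter_a, Point(0, 0), 1000.0, 2.5, 20)
print(may_be_relevant(RECT_A, Point(0, 0), 1000.0), result)
```

In this sketch the distance condition is evaluated at the mediator because source A only accepts name or address conditions; a plan that starts from source B could instead push the distance condition and the ordering down to the source.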