Google Query Language -- a DSL for Advanced Google Searching
Xiaoqing Wu
Advisor: Dr. Barrett R. Bryant
Department of Computer and Information Science
03/04/2005

Background
• PhD research: Compiler Development Environment (CDE)
  – Compiler, interpreter, and integrated development environment automatic generation
  – Several Domain-Specific Languages have been developed on top of CDE
• GQL: an application based on CDE
  – Internet -- Database
  – Google -- Database Management System (DBMS)
  – GQL -- Structured Query Language (SQL)

Google: more than keyword searching
• Language preference
• File format, date, occurrences, domain
• Image, forum, shopping search

Query customization in Google
• Filling forms
• Writing meta-tokens directly
  – allintext: Xiaoqing Wu filetype:pdf

Why GQL (I)?
• Forms are not flexible
  – Fixed
  – Can’t be saved and reused
  – Filling multiple forms is time-consuming
  – Mouse operation is slower than keyboard operation

Why GQL (II)?
• Meta-tokens are not designed for end-users
  – Not user friendly
  – No syntax provided
  – No type-checking
  – Ambiguous
    keyword1 keyword3 OR keyword4 "keyword2"

GQL: A well-formed DSL
• User friendly grammar
  – Natural, SQL-like syntax rules, easy to follow
  – No ambiguity
• IDE support
  – Automatic syntax and type checking
• Program based query
  – Query could be saved and reused
  – Search from old query
• Flexible: numerous forms! No more forms!
search {key}* from file where {constraint}*

Demo

GQL Syntax Grammar
[1] query ::= SEARCH|IMAGE o_keylist occurrence constraints withinstmt
[2] o_keylist ::= keylist |
[3] keylist ::= key | keylist COMMA key
[4] key ::= word | noword | orwordlist | exactword
[5] word ::= STRING
[6] noword ::= NOT word
[7] orwordlist ::= orword OR orword | orwordlist OR orword
[8] orword ::= word | exactword
[9] exactword ::= QSTRING
[10] occurrence ::= FROM OCCVALUE |
[11] constraints ::= WHERE constraintlist |
[12] constraintlist ::= constraint | constraintlist constraint
[13] constraint ::= domain | filetype
[14] domain ::= indomain | outdomain
[15] indomain ::= DOMAIN EQ url
[16] outdomain ::= DOMAIN NE url
[17] url ::= QSTRING
[18] filetype ::= acceptfiletype | rejectfiletype
[19] acceptfiletype ::= TYPE EQ TYPEVALUE
[20] rejectfiletype ::= TYPE NE TYPEVALUE
[21] withinstmt ::= WITHIN QSTRING |

GQL IDE structure
[Diagram. Labels: GQL IDE, Query Program, GQL Compiler, Google-recognizable tokens, Google Search Engine, Query Result]

Compiler implementation in CDE
[Diagram. Labels: GQL Specification, TLG Compiler, JLex Specification, CUP Specification, JLex, CUP, Lexer in Java, Parser in Java, AST Nodes, Typechecking in AspectJ, Code generation in AspectJ, Aspect Weaving, GQL Compiler]

Current status
• Basic GQL compiler
• IDE supporting multiple document management
  – Program storage
  – Editing
  – Compiling, type-checking and execution
• Functionality including all features of Google web & image search
• Search within old queries

Future work
• Extending the grammar to implement all the functionality provided by Google
• Adding stricter type-checking for source programs written in GQL
• Search result integration

Conclusion
• To provide more flexibility in online search, a SQL-like query language has been developed for the Google query domain.
• GQL programs substitute for the query forms provided by Google, analogous to SQL and query forms in a DBMS, e.g. MS-Access.
• The idea could be generalized to other domains, especially in online searching, e.g. airfare searching.
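
Illustrative sketch (not the original implementation, which the slides describe as Java with AspectJ code generation): the step from a GQL query to Google-recognizable tokens could look roughly like the Python below. The `Key` and `Query` classes, the `to_google_tokens` function, and the `OCCURRENCE_PREFIX` table are hypothetical names introduced for this example. The slides do not list the OCCVALUE tokens, so the occurrence names used here are assumptions, and the WITHIN clause is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed mapping from GQL occurrence values to Google operators.
# The slides do not enumerate OCCVALUE, so these keys are hypothetical.
OCCURRENCE_PREFIX = {
    "text": "allintext:",
    "title": "allintitle:",
    "url": "allinurl:",
}


@dataclass
class Key:
    """One search key. kind is 'word', 'not', 'exact', or 'or'."""
    kind: str
    values: List[str]


@dataclass
class Query:
    """A simplified stand-in for the AST of one GQL query."""
    keys: List[Key] = field(default_factory=list)
    occurrence: Optional[str] = None
    # Each constraint is (operator, value) with operator "=" or "!=".
    domains: List[Tuple[str, str]] = field(default_factory=list)
    filetypes: List[Tuple[str, str]] = field(default_factory=list)


def key_to_tokens(key: Key) -> str:
    """Render one key using Google's keyword syntax."""
    if key.kind == "word":
        return key.values[0]
    if key.kind == "not":
        return "-" + key.values[0]           # excluded word
    if key.kind == "exact":
        return '"' + key.values[0] + '"'     # exact phrase
    if key.kind == "or":
        return " OR ".join(key.values)       # alternatives (plain words only here)
    raise ValueError("unknown key kind: " + key.kind)


def constraint_to_token(operator: str, prefix: str, value: str) -> str:
    """Render DOMAIN/TYPE constraints; '!=' becomes a leading minus."""
    if operator == "=":
        return prefix + value
    if operator == "!=":
        return "-" + prefix + value
    raise ValueError("unknown operator: " + operator)


def to_google_tokens(query: Query) -> str:
    """Generate the meta-token string for a query."""
    parts: List[str] = []
    if query.occurrence is not None:
        parts.append(OCCURRENCE_PREFIX[query.occurrence])
    parts.extend(key_to_tokens(k) for k in query.keys)
    parts.extend(constraint_to_token(op, "site:", url) for op, url in query.domains)
    parts.extend(constraint_to_token(op, "filetype:", t) for op, t in query.filetypes)
    return " ".join(parts)


# Reproduces the meta-token example shown on the "Query customization" slide.
example = Query(
    keys=[Key("word", ["Xiaoqing"]), Key("word", ["Wu"])],
    occurrence="text",
    filetypes=[("=", "pdf")],
)
print(to_google_tokens(example))  # allintext: Xiaoqing Wu filetype:pdf
```

Keeping the query as data rather than as a filled-in form is what makes it possible to save, reuse, and check a query before it is sent, which is the flexibility argument made in the slides.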