Google XML Reference
Google Search Appliance
Confidential: For Customer Use Only
Revised October, 2005

Google has developed a simple HTTP-based protocol for serving search results. Search administrators have complete control over how search results are requested and presented to the end user. This document describes the technical details of the Google search request and results formats. It assumes that the reader has a basic understanding of the HTTP protocol and the HTML document format.

Contents
1. Overview
2. Request Format
    2.1 Request Overview
    2.2 Search Parameters
    2.3 Query Terms
    2.4 Filtering
    2.5 Internationalization
    2.6 Sorting
    2.7 Meta Tags
    2.8 Limits
3. Results Format
    3.1 Custom HTML
        3.1.1 Custom HTML Output Overview
        3.1.2 Internationalization
    3.2 XML
        3.2.1 XML Output Overview
        3.2.2 Character Encoding Conventions
        3.2.3 Google XML Results DTD
        3.2.4 Google XML Tag Definitions
Appendices
    Appendix A: Estimated vs. Actual Number of Results
    Appendix B: URL Escaping
Glossary

1. Overview

A Google search request is a simple HTTP request to the Google search engine. The search request format and options available are detailed in the Request Format section. The search results are returned in the output format specified in the search request. Currently, Google supports output results in XML and HTML format. XML formatted results give you the power to customize the display of the results through the implementation of a custom XML parser. The HTML results can be customized through the application of an XSL stylesheet to the standard XML results.

2. Request Format

This section is broken into the following categories:
    Request Overview
    Search Parameters
    Query Terms
    Filtering
    Internationalization
    Sorting
    Meta Tags
    Limits

2.1 Request Overview

Using the Google search protocol is as simple as requesting a page from a web server.
The Google search request is a standard HTTP GET command, which returns results in either XML or HTML format as specified in the search request. The search request is a URL combining the search engine host name, port, and path with a collection of name-value pairs (input parameters) separated by & characters. Some examples are listed below. Explanations of input parameters and output results can be found in the remainder of this document.

Note: Google recommends performing an HTTP version 1.0 (or later) GET command.

Note: To determine which host name and port to send your search requests to, please review your specific configuration documentation. The path to send your search requests to is always "/search".

Examples

The query

GET /search?q=bill+material&output=xml&client=test&site=operations

would return the first 10 results matching the query "bill material" in the "operations" collection in the Google XML output format.

The query

GET /search?q=bill+material&start=10&num=5&output=xml_no_dtd&proxystylesheet=test&client=test&site=operations

would return results numbering 11-15 matching the query "bill material" in the "operations" collection in the Google XML output format.

The query

GET /search?q=Star+Wars+Episode+%2BI&output=xml_no_dtd&lr=lang_de&ie=latin1&oe=latin1&client=test&site=movies&proxystylesheet=test

would return the first 10 German results matching the query "Star Wars Episode +I" in the "movies" collection, returned in the Google custom HTML output format by applying the XSL stylesheet associated with the "test" front end to the standard XML results.

2.2 Search Parameters

This table lists all the valid name-value pairs that can be used in a search request and describes how these parameters modify the search results. Each entry gives the parameter name, its description, and its default value.

access
Defines whether the user is searching public content or all content (i.e., public and secure).
This parameter takes effect only if the Secured Content Search capability is enabled. The access parameter can have one of these possible values:
    p - search public content
    s - search secure content
    a - search all content, both public and secure
The access parameter defaults to "p" if none is provided.
Note: Secured Content Search is automatically enabled for clustered appliances.
Default value: p

as_dt
Modifies the as_sitesearch parameter as follows:
    i - Include only results in the web directory specified by as_sitesearch
    e - Exclude all results in the web directory specified by as_sitesearch
Default value: i

as_epq
Adds an additional search query term to search for the phrase specified. This parameter has the same effect as the phrase special query term.
Note: New query terms specified will be combined with q query terms to generate search results.
Note: The value specified for this parameter must be URL-escaped.
Default value: Empty string

as_eq
Adds additional search query terms to exclude any of the terms specified. This parameter has the same effect as the exclude (-) special query term.
Note: New query terms will be combined with q query terms to generate search results.
Note: The value specified for this parameter must be URL-escaped.
Default value: Empty string

as_lq
Additional search query term to show any pages which link to the specified URL. This parameter has the same effect as the link special query term.
Note: No other query terms can be specified when using this special query term.
Note: The value specified for this parameter must be URL-escaped.
Default value: Empty string

as_occt
Additional search query term to specify where the search terms occur on the page: anywhere on the page, in the title, or in the URL.
Note: Query terms specified will be combined with q query terms to generate search results.
Note: The value specified for this parameter must be URL-escaped.
    Value    Meaning
    any      anywhere on the page
    title    in the title of the page
    URL      in the URL for the page
Default value: Empty string

as_oq
Adds additional search query terms to find any of the terms specified. This parameter has the same effect as the OR special query term.
Note: New query terms will be combined with q query terms to generate search results.
Note: The value specified for this parameter must be URL-escaped.
Default value: Empty string

as_q
Search query terms as entered by the user. (See Query Terms section for additional query features.)
Note: Query terms specified will be combined with q query terms to generate search results.
Note: The value specified for this parameter must be URL-escaped.
Default value: Empty string

as_sitesearch
Additional search query term to show links in the specified web directory or to exclude those links, depending on the value of as_dt. This parameter has the same effect as the site special query term.
When the Google Search Appliance is sent a search request that includes the as_sitesearch parameter, it converts the value of the parameter into an argument to the site special query term and appends it to the value of q in the search results. For example, if your search contains the following parameters:
    q=mycompany&as_sitesearch=www.mycompany.com
the raw XML of your search results will contain the following:
    <q>mycompany site:www.mycompany.com</q>
The default XSLT stylesheet displays the value of the q tag in the search box on the results page. Consequently, using an as_sitesearch parameter will appear to change the user's search query.
If the parameter and value as_dt=e are specified, -site: is appended to the end of the query term.
Note: The value specified for this parameter must be URL-escaped and contain fewer than 125 characters.
Default value: Empty string

client
A string indicating any valid front end.
Default value: REQUIRED

filter
Activates or deactivates automatic results filtering performed by Google search. By default, filtering is applied to all Google results returned to improve results quality. (See Automatic Filtering section for more details.)
Default value: 1

getfields
Requests that the names and values of the meta tags specified be returned with each search result, when available. (See Meta Tags section for more details.)
Note: All meta tag names or values specified must be double URL-escaped.
Default value: Empty string

ie
Input Encoding. Sets the character encoding used to interpret the query string. (See Internationalization section for details.)
Default value: latin1

lr
Language restrict. Restricts searches to pages in the specified language. (See Language Filters section for more details.)
Default value: Empty string

num
Number of results desired for a single request. The maximum allowable value is 100. (The maximum number of results available for a query is 1,000.) See also start.
Note: The actual number of results may be smaller than the requested value.
Default value: 10

numgm
Number of KeyMatch results to return with the results. A value from 0 to 5 (inclusive) can be specified for this option.
Default value: 3

oe
Output Encoding. Sets the character encoding used to encode the results returned. (See Internationalization section for details.)
Default value: UTF8

output
Selects the format of the search results. Valid formats are:
    xml_no_dtd - XML results or custom HTML. (See proxystylesheet parameter for details.)
    xml - XML results with Google DTD reference. If using this value, proxystylesheet must be omitted from the parameters or must be set to an empty string.
Default value: REQUIRED

partialfields
Restricts the search results to documents with meta tags whose values contain the words or phrases specified. (See Meta Tags section for more details.)
Note: All meta tag names or values specified must be double URL-escaped.
Default value: Empty string

proxycustom
Custom XML tags to be included in the XML results. The only permitted values for this parameter are <HOME/>, <ADVANCED/>, or <TEST/>. (See the Custom HTML output section for more details.)
Note: This parameter is disabled if the search request does not contain the proxystylesheet parameter.
Note: If custom XML is specified, search results will not be returned with the search request.
Note: Custom XML must be URL-escaped.
Default value: Empty string

proxyreload
A value of 1 indicates that the Google Search Appliance should update the XSL stylesheet cache to refresh the stylesheet currently being requested. This parameter is optional. The XSL stylesheet cache is updated approximately every 15 minutes. (See the Custom HTML section for more details.)
Default value: 0

proxystylesheet
If the value of the output parameter is xml_no_dtd, then the output format is modified by the proxystylesheet value as follows:
    Omitted - XML results
    Empty - XML results have a contenttype of text/html (rather than text/xml), because the XML results are not transformed.
    Front End Name - Custom HTML results through application of the XSL stylesheet associated with the specified front end
(See the Custom HTML section for more details.)
Note: This parameter may also specify the identifier of a valid collection. The default XSL stylesheet associated with that collection will then be used for custom HTML output.
Note: The value specified for this parameter must be URL-escaped.
Default value: NA

q
Search query as entered by the user. (See Query Terms section for additional query features.)
Note: The value specified for this parameter must be URL-escaped.
Default value: Empty string

requiredfields
Restricts the search results to documents that contain exact meta tag names or name-value pairs specified. (See Meta Tags section for more details.)
Note: All meta tag names or values specified must be double URL-escaped.
Default value: Empty string

site
The name of a collection. Note that you can search over multiple collections using the properly escaped OR (pipe character) to separate the collection names.
Default value: REQUIRED

sitesearch
Additional search query term to show links in the specified web directory. Requires that a value for q (query) be submitted as well. (The value of as_dt does not modify the effect of the sitesearch parameter.)
This parameter has the same effect as the site special query term.
Note: The sitesearch and as_sitesearch parameters differ in how they are returned in the XML results. The sitesearch parameter is not appended to the search query in the results. That is, the original query term will not be modified when you use the sitesearch parameter.
Note: The value specified for this parameter must be URL-escaped and contain fewer than 125 characters.
Default value: Empty string

sort
Indicates alternate sorting method. (See Sorting section for sort parameter format and details.)
Note: Only date sort is currently supported.
Default value: Empty string

start
Use this parameter to support result set page navigation. The maximum number of results available for a query is 1,000, i.e., the value of the start parameter added to the value of the num parameter cannot exceed 1,000. See also num.
Default value: 0

Custom Parameters

If any custom parameters that contain spaces are added to the search URL, the space characters will be replaced by an underscore (_). For example:

http://search.customer.com/search?q=customer+query&site=collection&client=collection&output=xml_no_dtd&newparam=test+this

This URL adds the custom parameter newparam with a value of "test+this". For security reasons, all space characters (represented as a "+") in the custom parameter newparam will be replaced by "_" characters, while built-in variables, such as q, will not be affected. The resulting XML will look like this:

<PARAM name="q" value="customer query" original_value="customer+query"/>
<PARAM name="newparam" value="test_this" original_value="test+this"/>

The unmodified value can still be retrieved from the original_value attribute.

2.3 Query Terms

Default Search

By default, Google only returns pages that include all of your search terms. There is no need to include "AND" between terms. Keep in mind that the order in which the terms are typed will affect the search results.
To restrict a search further, just include more terms. Google ignores common words and characters such as "where" and "how," as well as certain single digits and single letters, because they tend to slow down your search without improving the results. Google will indicate if a common word has been excluded by including text in the search comments field of the search results returned.

Special Characters

By default, all non-alphanumeric characters that are included in a search query are treated as query term separators (just like space characters). The exceptions to this rule are the following characters: double quote mark ("), plus sign (+), minus sign (hyphen) (-), decimal point (.), and ampersand (&). The ampersand character (&) is treated as another character in the query term in which it is included. The decimal point is a query term separator unless it is part of a number (e.g., 250.01), in which case it counts as part of the query term. The remaining exception characters correspond to search features listed in the section below.

If your document contains a number, with or without a decimal point, that has letters immediately before or after it, the letters are treated as a separate word or words. For example, the string 802.11a is indexed as two separate words, 802.11 and a.

Special Query Terms

Google supports the use of several special query terms that allow the user or search administrator to access additional capabilities of the Google search engine. These special query terms are listed below. Each entry gives the special query capability, a sample usage, and a description.

Note: All query terms must be correctly URL-escaped in the search request sent to Google search.

Exclude Query Term
Sample usage: bass -music
Sometimes what you're searching for has more than one meaning. For example, the term "bass" can refer to either fishing or music. You can exclude a word from your search by putting a minus sign ("-") immediately in front of the term you want to exclude from the search results.
Note: The search request parameter, as_eq, can also be used to submit terms to exclude.

Phrase Search
Sample usage: "yellow pages"
Search for complete phrases by enclosing them in quotation marks or connecting them with hyphens. Words marked in this way will appear together in all results exactly as you have entered them. Phrase searches are especially useful when searching for famous sayings or proper names.
Note: The search request parameter, as_epq, can also be used to submit a phrase search.

Boolean OR Search
Sample usage: vacation london OR paris
Google search supports the Boolean "OR" operator. To retrieve pages that include either word A or word B, use an uppercase OR between terms.
Note: The search request parameter, as_oq, can also be used to submit a search for any term in a set of terms.

Directory Restricted Search
Sample usage: admission site:www.stanford.edu/group/uga
Domain search examples:
    site:www.google.com
    site:google.com
    site:com
Directory search examples:
    site:www.google.com/about/
    site:www.google.com/about
To search a domain, specify a partial string that matches complete name segments from the end of the canonical host name. To search a particular directory on a web server (including root), you must specify the complete canonical name of the host server followed by the path of the directory. The string must have a "/" character after the host name to limit searches to a single server/directory. The path segments searched must be a complete match, because there is no partial path segment matching.
Enter the query followed by "site:" followed by the host name and path of the web directory. If the "/" character is at the end of the web directory path specified, then only files within that directory will be searched and files in sub-directories will not be considered. The URLs for these queries must contain fewer than 119 characters.
Note: The exclusion operator ("-") can be applied to this query term to remove a web directory from consideration in the search.
Note: Only one "site:" search term per search request is supported at this time.
Note: The search request parameters, as_sitesearch and as_dt, can also be used to submit "site:" and "-site:" search terms.

Title Search (term)
Sample usage: intitle:Google search
If you prepend "intitle:" to a query term, Google search will restrict the results to documents containing that word in the title. The query term must appear in the first 10 words of the title. Note there can be no space between the "intitle:" and the following word.
Note: Putting "intitle:" in front of every word in your query is equivalent to putting "allintitle:" at the front of your query.

Title Search (all)
Sample usage: allintitle: Google search
If you start a query with the term "allintitle:", Google search will restrict the results to those with all of the query words in the title. The query terms must appear in the first 10 words of the title.

URL Search (term)
Sample usage: inurl:Google search
If you prepend "inurl:" to a query term, Google search will restrict the results to documents containing that word in the result URL. Note there can be no space between the "inurl:" and the following word.
Note: "inurl:" works only on words, not URL components. In particular, it ignores punctuation and will only use the first word following the "inurl:" operator. To find multiple words in a result URL, use the "inurl:" operator for each word.
Note: Putting "inurl:" in front of every word in your query is equivalent to putting "allinurl:" at the front of your query.

URL Search (all)
Sample usage: allinurl: Google search
If you start a query with the term "allinurl:", Google search will restrict the results to those with all of the query words in the result URL.
Note: "allinurl:" works only on words, not URL components. In particular, it ignores punctuation. Thus, "allinurl: foo/bar" will restrict the results to pages with the words "foo" and "bar" in the URL, but won't require that they be separated by a slash within that URL, that they be adjacent, or that they be in that particular word order. There is currently no way to enforce these constraints.

File Type Filtering
Sample usage: Google filetype:doc OR filetype:pdf
The query prefix "filetype:" will filter the results returned to only include documents with the extension specified immediately after. Note there can be no space between "filetype:" and the specified extension.
Note: Multiple file types can be included in a filtered search by adding more "filetype:" terms to the search query, when used in conjunction with the Boolean OR.

File Type Exclusion
Sample usage: Google -filetype:doc -filetype:pdf
The query prefix "-filetype:" will filter the results to exclude documents with the extension specified immediately after. Note there can be no space between "-filetype:" and the specified extension.
Note: Multiple file types can be excluded in a filtered search by adding more "-filetype:" terms to the search query.

Web Document Info
Sample usage: info:www.google.com
The query prefix "info:" will return a single result for the specified URL if it exists in the index.
Note: No other query terms can be specified when using this special query term.

Back Links
Sample usage: link:www.google.com
The query prefix "link:" will list web pages that have links to the specified web page. Note there can be no space between "link:" and the web page URL.
Note: No other query terms can be specified when using this special query term.
Note: The search request parameter, as_lq, can also be used to submit a link: request.

Cached Results Page
Sample usage: cache:www.google.com web
The query prefix "cache:" will return the cached HTML version of the specified web document that the Google search crawled. Note there can be no space between "cache:" and the web page URL. If you include other words in the query, Google will highlight those words within the cached document.
Note: To use Google's default cached result display, simply omit the output parameter in the cache request. To customize the display of cached results, simply request XML or Custom HTML output as part of the cache request and ensure your parser or stylesheet will handle the incoming cache data.

2.4 Filtering

Google search provides many ways for you to filter the results that are returned as part of your query. These filtering options include:
    Automatic Filtering
    Language Filters
        Automatic Language Filters
        Combining Language Filters

Other filtering options can be applied through special query parameters, query terms and meta tags, which are documented in their respective sections. Please review these sections for more information on other filtering options.

2.4.1 Automatic Filtering

The quality of the results Google returns for searches is extremely important. One method that makes sure the best results are returned for a query is automatic "filtering" of the search results to weed out undesirable results. Currently, Google search uses two techniques for automatic filtering of results:

Duplicate Snippet Filter - If multiple documents contain the same information in their snippets in response to a query, then only the most relevant document of that set will be displayed in the results.

Duplicate Directory Filter - If there are many results in a single web directory, then only the two most relevant results for that directory will be returned in the results. An output flag indicates that more results are available from that directory.

By default, both types of filters are enabled. However, you can disable them with the filter parameter.

Setting filter=1 enables both Duplicate Directory Filtering and Duplicate Snippet Filtering. This is the default setting if no value for the filter parameter is provided.

Setting filter=0 will disable both Duplicate Directory Filtering and Duplicate Snippet Filtering.
Although determining when to use this option is up to each search administrator, Google recommends against setting filter=0 for typical search requests, since Google has found that document filtering significantly enhances the quality of most search results.

Setting filter=p will disable Duplicate Snippet Filtering only.

Setting filter=s will disable Duplicate Directory Filtering only.

When an end user submits a search request in which filtering removes any results, the removal of the results will be noted in the output generated for the search results. See the section on Estimated vs. Actual Number of Results for more information on how a filtered result set is identified and recommendations for results display.

The appliance also will automatically group results from a single directory in the search results. If you set filter=0, then the order in which results are ranked can change depending on the value of the num parameter. For example, if you set num=10 and filter=0, you may get two results in a particular directory that are considered in the 10 most relevant results. If one of these results is the most relevant of all, then directory crowding will cause both to be displayed at the top of the results. If you now set num=20, you may get a third result in the same directory that would be ranked somewhere between 11 and 20. However, this result will actually be ranked third because of directory crowding.

2.4.2 Language Filters

This section covers:
    Automatic Language Filters
    Combining Language Filters

2.4.2.1 Automatic Language Filters

Language filters limit searches to pages in the specified languages. The algorithm for automatically determining the language of a web document is not customizable. The language determination algorithm is primarily based on the majority language used in the web document body text. Automatic language collections may not be appropriate for all users.
Note: Encoding schemes for input and output of search requests are important when providing international search. Please review the Internationalization section for more details.

The automatic language filters generated are:

    Language                 Automatic Language Filter Name
    Arabic                   lang_ar
    Chinese (Simplified)     lang_zh-CN
    Chinese (Traditional)    lang_zh-TW
    Czech                    lang_cs
    Danish                   lang_da
    Dutch                    lang_nl
    English                  lang_en
    Estonian                 lang_et
    Finnish                  lang_fi
    French                   lang_fr
    German                   lang_de
    Greek                    lang_el
    Hebrew                   lang_iw
    Hungarian                lang_hu
    Icelandic                lang_is
    Italian                  lang_it
    Japanese                 lang_ja
    Korean                   lang_ko
    Latvian                  lang_lv
    Lithuanian               lang_lt
    Norwegian                lang_no
    Portuguese               lang_pt
    Polish                   lang_pl
    Romanian                 lang_ro
    Russian                  lang_ru
    Spanish                  lang_es
    Swedish                  lang_sv
    Turkish                  lang_tu

2.4.2.2 Combining Language Filters

Search requests that use the lr parameter support the Boolean operators identified below (in order of precedence).

Boolean NOT [ - ]
Sample usage: -lang_fr
Removes all results that are defined as part of the Language Filter immediately following the "-" operator. The example lr value would remove all results in French.

Boolean AND [ . ]
Sample usage: gloves.hats
Returns results that are in the intersection of the results returned by the collection to either side of the "." operator. The example restrict value would return all results which are in both the "hats" and "gloves" custom collections.

Boolean OR [ | ]
Sample usage: lang_en|lang_fr
Returns results that are in either of the results returned by the collection to either side of the "|" operator. The example lr value would return all results matching the query that are in either French or English.

Parentheses [ ( ) ]
Sample usage: (gloves).(-(lang_hu|lang_cs))
All terms within the innermost set of parentheses will be evaluated before terms outside the parentheses are evaluated. Use parentheses to adjust the order of term evaluation. The example lr value would return all results in the "gloves" custom collection that are not in either the Hungarian or Czech collections.

Note: Spaces are not valid characters in the collection string.

2.5 Internationalization

In order to support searching documents in multiple languages and character encodings, Google provides the ie parameter to specify how Google search should interpret characters in the search request, and the oe parameter to specify how characters in the search results output should be encoded. To appropriately decode the search query and correctly encode the search results, specify the correct ie and oe parameters, respectively, in the search request.

Note: When providing search for multiple languages, Google recommends the usage of the utf8 encoding value for the ie and oe parameters.

Example

The query

GET /search?q=gloves&client=test&site=test&lr=lang_en|lang_fr&ie=latin1&oe=latin1

would interpret the search query "gloves" using the latin1 encoding scheme, search for English or French results, and return results in the latin1 encoding scheme.

The query

GET /search?q=gloves&client=test&site=test&lr=(-lang_hu).(-lang_cs)&ie=latin2&oe=latin2

would interpret the search query "gloves" using the latin2 encoding scheme, search for any results which are not in Hungarian or Czech, and return results in the latin2 encoding scheme.

The query

GET /search?q=gloves&client=test&site=test&lr=lang_zh-CN|lang_zh-TW&ie=utf8&oe=utf8

would interpret the search query "gloves" using the utf8 encoding scheme, search for any results which are in Simplified or Traditional Chinese, and return results in the utf8 encoding scheme.

Note: See the Language Filters section for details of language specific searches using the lr parameter.
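As a rough illustration of the examples above, the following Python sketch assembles a request path that carries the lr, ie, and oe parameters. The function name build_search_path and the CODECS mapping are hypothetical helpers, not part of the appliance. The sketch assumes that the query should be percent-encoded using the character set named by ie, and it URL-escapes the "|" and parenthesis characters in the lr value rather than sending them raw as the examples do. It also adds an output parameter, which the examples above omit.

```python
from urllib.parse import urlencode

# Hypothetical mapping from appliance encoding values to Python codec names.
# Only the three values used in the examples above are covered.
CODECS = {"latin1": "iso-8859-1", "latin2": "iso-8859-2", "utf8": "utf-8"}


def build_search_path(query, site, client, lr, ie, oe, output="xml_no_dtd"):
    """Return a /search request path with URL-escaped parameter values."""
    params = {
        "q": query,
        "client": client,
        "site": site,
        "output": output,
        "lr": lr,
        "ie": ie,
        "oe": oe,
    }
    # Assumption: the query bytes are escaped in the encoding named by ie.
    return "/search?" + urlencode(params, encoding=CODECS[ie])


# English or French results, latin1 in and out.
print(build_search_path("gloves", "test", "test",
                        "lang_en|lang_fr", "latin1", "latin1"))

# Neither Hungarian nor Czech, with a non-ASCII query sent as utf8.
print(build_search_path("caf\u00e9", "test", "test",
                        "(-lang_hu).(-lang_cs)", "utf8", "utf8"))
```

Keeping ie and the percent-encoding of q in step is the point of the sketch: the same query string produces different escaped bytes under latin1 and utf8, so a mismatch would cause the query to be decoded incorrectly.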
Character Encoding Values

The following table lists all encoding values supported by these parameters:

    Language                   Encoding Value    Alternate Encoding Value
    Chinese (Simplified)       gb                GB2312
    Chinese (Traditional)      big5              Big5
    Czech                      latin2            ISO-8859-2
    Danish                     latin1            ISO-8859-1
    Dutch                      latin1            ISO-8859-1
    English                    latin1            ISO-8859-1
    Estonian                   latin4            ISO-8859-4
    Finnish                    latin1            ISO-8859-1
    French                     latin1            ISO-8859-1
    German                     latin1            ISO-8859-1
    Greek                      greek             ISO-8859-7
    Hebrew                     hebrew            ISO-8859-8
    Hungarian                  latin2            ISO-8859-2
    Icelandic                  latin1            ISO-8859-1
    Italian                    latin1            ISO-8859-1
    Japanese                   sjis              Shift_JIS
    Korean                     euc-kr            EUC-KR
    Latvian                    latin4            ISO-8859-4
    Lithuanian                 latin4            ISO-8859-4
    Norwegian                  latin1            ISO-8859-1
    Portuguese                 latin1            ISO-8859-1
    Polish                     latin2            ISO-8859-2
    Romanian                   latin2            ISO-8859-2
    Russian                    cyrillic          ISO-8859-5
    Spanish                    latin1            ISO-8859-1
    Swedish                    latin1            ISO-8859-1
    (no language listed)       latin3            ISO-8859-3
    (no language listed)       latin5            ISO-8859-9
    (no language listed)       latin6            ISO-8859-10
    (no language listed)       euc-jp            EUC-JP
    Unicode (All Languages)    utf8              UTF-8

2.6 Sorting

Google search provides two sorting options for implementing your search solution:
    Sort By Relevance
    Sort By Date

2.6.1 Sort By Relevance (Default)

By default, Google combines hypertext analysis and PageRank technologies to provide users with highly relevant results. Hypertext analysis uses the design of the page, examining over 100 factors to determine the best result for your query term. PageRank considers the link structure of the entire index to understand how each page links to the other pages in the index.

2.6.2 Sort By Date

Google search also supports the ability to order search results by date. The date of a web document is defined by parameters configured by the search administrator. When a search is performed using the sort by date capability, the date associated with each result document will be included with the results.

When using the Sort by Date feature, the automatic quality filter will sometimes reorder results when performing result grouping.
This can be disabled by adding the "filter=0" parameter to the search request when performing search by date.

Example

The query

GET /search?q=chicken+teriyaki&output=xml&client=test&site=test&sort=date:D:S:d1

would return the first 10 top results sorted by both date and relevancy which match the query "chicken teriyaki" in the "test" collection.

Details

To sort the results by date, the sort parameter must be formatted as follows:

date:<direction>:<mode>:<format>

where <direction>, <mode> and <format> can have the following values:

<direction>
    A - Sort results in ascending date order
    D - Sort results in descending date order

<mode>
    S - Sort relevant results. Google's algorithm will determine a subset of the most relevant results from the set of all results, and then sort that subset by date to return as results for the search request.
    R - Sort all results. Note: Providing sort by date on queries with large result sets may incur performance penalties.
    L - Perform a look-up on the date associated with each document and return the date information for each result returned, but no sorting is performed.

<format>
    d1 - The date value returned for each search result is formatted as YYYY-MM-DD.

2.7 Meta Tags

Google search provides two options for leveraging the meta tags that are available in your content. Unless one of these parameters is specified, meta tag information will not be considered in your search results, since that information is not visible to the search user. These options are:
    Requesting Meta Tag Values
    Filtering by Meta Tags

2.7.1 Requesting Meta Tag Values

Through the use of the getfields parameter, the Google search engine allows a search request to specify meta tag values to return with the search results. The search engine will only return meta tag information for results which actually contain the meta tags. The search for meta tags is case-insensitive.
Use only whole words in the getfields parameter, not partial words or word "stems." There is a limit of 320 characters returned for each meta tag when using getfields. This character limit includes the meta tag name and content.

Usage

GET /search?q=[search term]&output=xml&client=test&site=test&getfields=[meta tag name]

Example

The query

GET /search?q=books&output=xml&client=[test]&site=[test]&getfields=author.title.keywords

would return the first 10 results that match the query "books" in the "test" collection. If any of the results contain the author, title and/or keywords meta tags, then the values of those meta tags will be returned with the respective results. For example, the following tags could be returned with this search request:

<META NAME="author" CONTENT="Jakob Nielsen">
<META NAME="title" CONTENT="Usability Engineering">
<META NAME="keywords" CONTENT="Usability, User Interface, User Feedback">

Details

To specify multiple meta tag values to be returned, list all meta tag names separated by a period (".") as in the example above. To request all available meta tags for each search result, specify an asterisk ("*") as the value for the getfields parameter.

Note: When meta tag values are requested, they are not displayed in results requested in the default HTML format. Please use the custom HTML or XML output options to take advantage of this feature.

Note: All meta tag names or values specified must be double URL-escaped. See an example in the following section.

2.7.2 Filtering by Meta Tags

The Google search engine can filter results by the values of the result meta tags. This section details how to use the requiredfields and partialfields input parameters to filter on meta tag values. The term partialfields refers to part of the meta tag content, rather than part of a word. Other filtering techniques are noted in the Filtering section.
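The values passed to getfields, requiredfields and partialfields can be assembled programmatically. The following is a minimal Python sketch, not part of the Google search protocol itself. The helper names double_escape and meta_filter are hypothetical, and the sketch assumes that double URL escaping means applying the standard library's quote_plus twice, as described in Appendix B. Note that quote_plus leaves "-", "_", "." and "~" unescaped, so its output can differ from the examples in this document for strings containing those characters; verify the result against your own appliance.

```python
from urllib.parse import quote_plus


def double_escape(text: str) -> str:
    """Apply URL escaping twice in succession (assumed reading of Appendix B)."""
    return quote_plus(quote_plus(text))


def meta_filter(constraints, operator="|"):
    """Join (name, value) meta tag constraints into one parameter value.

    operator is "." for Boolean AND or "|" for Boolean OR.
    A value of None filters only on the presence of the meta tag.
    """
    if operator not in (".", "|"):
        raise ValueError("operator must be '.' or '|'")
    parts = []
    for name, value in constraints:
        part = double_escape(name)
        if value is not None:
            part += ":" + double_escape(value)
        parts.append(part)
    return operator.join(parts)


if __name__ == "__main__":
    print(double_escape("William Shakespeare"))
    print(meta_filter([("department", "Sales"), ("department", "Finance")]))
```

The operator characters and the colon are deliberately left unescaped, so that only the meta tag names and values themselves are double URL-escaped.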
Usage

GET /search?q=[search term]&output=xml&client=test&site=test&requiredfields=[metatag name]:[metatag content]

Examples

The query

GET /search?q=checks&output=xml&client=test&site=test&requiredfields=department:Human%252BResources|department:Finance

returns the first 10 results which match the query "checks" in the "test" collection which also contained either of the following meta tags:

<META NAME="department" CONTENT="Human Resources">
<META NAME="department" CONTENT="Finance">

The query

GET /search?q=books&output=xml&client=test&site=test&partialfields=author:Scott

would return the first 10 results which match the query "books" in the "test" collection which also contained the word "Scott" somewhere in the "author" meta tag. Some example meta tags satisfying this search request are:

<META NAME="author" CONTENT="Sir Walter Scott">
<META NAME="author" CONTENT="F. Scott Fitzgerald">

Details

Multiple meta tag constraints can be specified using the requiredfields and partialfields parameters. To filter for the presence of a meta tag, indicate the name of the meta tag to be found. To filter on a specific meta tag value, indicate the name of the meta tag followed by the colon ":" character and then the specific value. The partialfields parameter matches complete words, not parts of words. In addition, the match must be within the first 160 characters of the meta tag. See the examples in the table below for sample usage.

To combine multiple name-value pairs, use the following operators:

Operator            Sample Usage                          Description
Boolean AND [ . ]   author:William.keywords               Returns results which satisfy both meta tag constraints.
Boolean OR [ | ]    department:Sales|department:Finance   Returns results which satisfy either meta tag constraint.

As stated in the "Query Terms" section, all non-alphanumeric characters included in a search query are treated as query term separators (just like space characters).
Similarly, Google uses these separators to divide metatag content into single entities, or word tokens; that is, a word or a string that may or may not be a real word. The separators, used in both queries and results, and their values are in the table. They are not customizable.

Separators listed without a value:

+ { ~ } ! | @ ` # [ $ ] % ^ : ; & ' * < ( > ) ? , . / = and the space character

Separators listed with a value:

Separator   Value
\           92
"           34
\t          9
\r          13
\n          10
\v          11
\f          12
\177        177

Note: All meta tag names or values specified must be double URL-escaped. See example above.

2.8 Limits

This section lists any limitations on the search requests sent to Google search.

Component                                                                           Limit
Search request length                                                               2048 bytes
Query terms (includes terms in parameter q and any parameters starting with as_)    50
site: query terms (includes use of as_sitesearch parameter)                         1 (per search request)

3. Results Format

This section is broken into the following categories:

Custom HTML
XML

3.1 Custom HTML

The description of the custom HTML results section is broken down into the following sections:

Custom HTML Output Overview
Internationalization

3.1.1 Custom HTML Output Overview

Google search provides the ability to generate custom HTML by incorporating an XSLT (eXtensible Stylesheet Language Transformation) server into the search engine infrastructure. Search requests submitted to the Google search engine with the input parameter output set to xml_no_dtd and a valid proxystylesheet parameter value will automatically be processed by the XSLT server as requests for custom HTML output. Using the XSL stylesheet specified by the proxystylesheet parameter, the XSLT server will apply the transformation rules found in the XSL stylesheet to the standard Google XML results and return the resulting output.
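As an illustration of how a custom HTML request might be assembled, the following Python sketch builds the path and query string with the output and proxystylesheet parameters and checks the result against the search request length limit from section 2.8. The function name build_search_path is hypothetical, and the sketch assumes that the 2048-byte limit is measured on the path plus query string; this document does not state exactly which bytes are counted.

```python
from urllib.parse import urlencode

MAX_REQUEST_BYTES = 2048  # "Search request length" limit from section 2.8


def build_search_path(query, site, client, stylesheet=None, **extra):
    """Return the path and query string for a search request.

    When stylesheet is given, the request asks the XSLT server for
    custom HTML output (output=xml_no_dtd plus proxystylesheet).
    """
    params = {"q": query, "site": site, "client": client, "output": "xml_no_dtd"}
    if stylesheet is not None:
        params["proxystylesheet"] = stylesheet
    params.update(extra)  # e.g. start=10, num=5
    path = "/search?" + urlencode(params)
    # Assumption: the limit applies to the path plus query string.
    if len(path.encode("ascii")) > MAX_REQUEST_BYTES:
        raise ValueError("search request exceeds %d bytes" % MAX_REQUEST_BYTES)
    return path


if __name__ == "__main__":
    print(build_search_path("bill material", "operations", "test",
                            stylesheet="test", start=10, num=5))
```

The host name and port are omitted on purpose, since they depend on the specific configuration of each installation.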
While this document assumes that the output generated by applying the XSL stylesheet will be HTML, almost any output format can be generated by the application of the appropriate XSL stylesheet rules.

For any front end, the default XSL stylesheet can be customized or replaced by the search administrator. To customize the XSL stylesheet used to generate custom HTML output, please review Google's XML output format to determine the XML tags that may be transformed using a customized XSL stylesheet.

Additionally, you can leverage the proxycustom parameter to pass custom XML tags to the XSLT server. Since the inclusion of custom XML does not generate search results, this feature is useful for implementing additional static search pages, such as an advanced search page.

Notes:

XSL stylesheets used by the XSLT server will be cached for 15 minutes. To force the XSLT server to use the latest version of an XSL stylesheet, set the proxyreload input parameter to a value of 1 in your search request.

XSL stylesheets which include other files may not be used with the Google search engine. Any XSL stylesheet which contains the following tags / functions will generate an error result: <xsl:import>, <xsl:include>, xmlns: and document()

When requesting cached results in custom HTML output, the BLOB XML tag and associated value are automatically converted to the original text before the XSL stylesheet rules are applied. When using an XSL stylesheet which customizes cache results, simply use the values of the CACHE_LEGEND_TEXT, CACHE_LEGEND_NOTFOUND and CACHE_LEGEND_HTML XML tags directly instead of applying a rule on the BLOB sub-tag.

If you use input or output encodings other than latin1, please consult the Internationalization section for more details.

More information on XSL and XSLT can be found on the W3C web site.

3.1.2 Internationalization

The Google search engine handles over 20 character encoding schemes.
This section will discuss any special considerations that must be made when using the custom HTML output format with encoding schemes other than latin1.

In order to support all the encoding schemes supported by Google, the XSLT server follows a process to ensure that the results are returned in the correct encoding scheme. When requesting search results through the XSLT server, the server will translate the results to the UTF8 encoding scheme before applying the selected XSL stylesheet. Once the XSL stylesheet rules are applied to generate the results, then the results will be converted to the encoding scheme specified in the output encoding parameter, oe, of the search request. The one exception to this rule is cached result pages, which get converted to the encoding scheme of the cached document after XSLT processing.

Note: XSL stylesheets are associated with a front end. All XSL stylesheets must be in latin1 or UTF8 formats.

3.2 XML

The description of the XML results format is broken down into the following sections:

XML Output Overview
Character Encoding Conventions
Google XML Results DTD
Google XML Tag Definitions

3.2.1 XML Output Overview

For maximum flexibility, Google provides search results in XML format. Using the Google XML results, you can use your own XML parser to customize the display for your search users. For developers who want to specify an XSL stylesheet for transformation of the XML results, instead of developing their own XML parser, proceed to the Custom HTML section.

Note: All element values will be valid HTML suitable for display, unless otherwise noted in the XML tag definitions. Some values are URLs which will need to be HTML encoded before displaying.

All XML parsers used to parse Google results should be built to ignore any attributes or tags which are not documented.
This will allow custom XML parsers to continue working without modification when Google adds more features to the XML output in the future.

In the value of any custom parameter that contains spaces, each space will be replaced with "_". You can still retrieve the unmodified value from "original_value". For example:

<PARAM name="temp" value="token_ring" original_value="token+ring" />

3.2.2 Character Encoding Conventions

The first line of the Google XML results will indicate which character encoding is used. See the XML Standard for more details.

Additionally, certain characters are required to be escaped when included as values in XML tags. These characters are documented in the XML standard, and are also reproduced in the table below. All other characters in the XML results will be presented without modification.

Character   Escaped form
<           either &lt; or &#60;
&           either &amp; or &#38;
>           either &gt; or &#62;
'           either &apos; or &#39;
"           either &quot; or &#34;

3.2.3 Google XML Results DTD

Google XML results can be returned either with or without a reference to the most recent DTD (Document Type Definition) describing Google's XML format. The DTD is a guide to help search administrators and XML parsers understand the XML results output. Since Google's XML grammar may change from time to time, you should not configure your parser to use the DTD to validate the XML results.

Additionally, XML parsers should not be configured to fetch the DTD every time a search request is performed. Since the DTD is updated infrequently, these fetches create unnecessary delay and bandwidth requirements. Google recommends that you use the xml_no_dtd output format to get XML results. If you specify the xml output format in your search request, then the only difference will be the inclusion of the following line in the XML results.
<!DOCTYPE GSP SYSTEM "google.dtd">

The DTD is available on the Google Search Appliance at http://<appliance_hostname>/google.dtd

If there are other features you would like to see on the DTD, please consult with your account representative. Not all features in the DTD may be available or supported at this time.

3.2.4 Google XML Tag Definitions

This section provides an index and details of Google's XML results.

Sub-Tags Legend

?   = optional sub-tag
*   = zero or more instances of the sub-tag
+   = one or more instances of the sub-tag
|   = Boolean OR

Index

The XML tags are listed in alphabetical order below.

Details

BLOB
Format: Text (See Definition)
Sub-Tags: (none)
Definition: This tag contains HTML data in the encoding format specified in the attribute. Additionally, the data has been BASE64 encoded to preserve data integrity of cached results encoded in a different encoding scheme than the results requested.
Attributes:
  encoding
    Format: Text (Encoding Scheme)
    Description: The encoding scheme of the HTML data (See the Internationalization section for a list of common encoding values)

C
Format: (none)
Sub-Tags: (none)
Definition: Indicates that the "cache:" special query term is supported for this search result URL
Attributes:
  SZ
    Format: Text (Integer + "k")
    Description: Provides the size of the cached version of the search result in kilobytes ("k"). This field is not populated if no cached version of a document is available, which can be the case if robots noarchive metatags are used.
  CID
    Format: Text
    Description: Identifier of a document in Google's cache. To fetch the document from the cache, send a search term built like this: "cache:" + CID text + ":" + escaped URL. The escaped URL is available in the UE tag. Send this search term normally, as one would type it into the search form.
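The construction of the "cache:" search term described for the CID attribute can be sketched in Python as follows. This is an illustrative sketch only: the sample result fragment, its values, and the function name cache_query_term are hypothetical, and the sketch assumes that the UE tag appears as a direct child of the R tag and that CID is an attribute of the C tag inside HAS.

```python
import xml.etree.ElementTree as ET

# Hypothetical result fragment; real results come from the search engine.
SAMPLE_RESULT = """<R N="1">
  <U>http://www.example.com/a b.html</U>
  <UE>http://www.example.com/a%20b.html</UE>
  <HAS><L/><C SZ="4k" CID="abc123"/></HAS>
</R>"""


def cache_query_term(result):
    """Build "cache:" + CID + ":" + escaped URL, or None if not cached."""
    c_tag = result.find("HAS/C")
    if c_tag is None or c_tag.get("CID") is None:
        return None  # no cached version is advertised for this result
    escaped_url = result.findtext("UE")
    if not escaped_url:
        return None
    return "cache:" + c_tag.get("CID") + ":" + escaped_url


if __name__ == "__main__":
    print(cache_query_term(ET.fromstring(SAMPLE_RESULT)))
```

The returned term would then be submitted as the value of the q parameter, URL escaped like any other query.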

CACHE
Format: (none)
Sub-Tags: CACHE_URL, CACHE_REDIR_URL, CACHE_LAST_MODIFIED, CACHE_LEGEND_FOUND?, CACHE_LEGEND_NOTFOUND?, CACHE_CONTENT_TYPE, CACHE_LANGUAGE, CACHE_ENCODING, CACHE_HTML
Definition: Provides encapsulation for the cached version of a search result
Attributes: (none)

CACHE_CONTENT_TYPE
Format: Text (MIME type)
Sub-Tags: (none)
Definition: MIME type of the cached result as specified in the HTTP header returned when the document was crawled
Attributes: (none)

CACHE_ENCODING
Format: Text
Sub-Tags: (none)
Definition: The encoding scheme of the cached result as specified in the HTTP header returned when the document was crawled (See the Internationalization section for a list of common values)
Attributes: (none)

CACHE_HTML
Format: Text (HTML) (Custom HTML output only)
Sub-Tags: BLOB? (XML output only)
Definition: The cached version of the search result. All search results are stored in HTML format after being translated for indexing.
Attributes: (none)

CACHE_LANGUAGE
Format: Text (Google language tag)
Sub-Tags: (none)
Definition: The language of the cached result as determined by Google's automatic language classification algorithm. The value of this tag is the same as the values used for the automatic language collections without the "lang_" prefix.
Attributes: (none)

CACHE_LAST_MODIFIED
Format: Text
Sub-Tags: (none)
Definition: Date that the document was crawled, as specified in the Date HTTP header when the document was crawled for this index. The crawler will fetch documents from its cache if the web server responds with a 304 (not modified) status code to an if-modified-since request. In this case, the CACHE_LAST_MODIFIED will be the date the document was originally crawled and not the date of the if-modified-since request.
Attributes: (none)

CACHE_LEGEND_FOUND
Format: (none)
Sub-Tags: CACHE_LEGEND_TEXT*
Definition: Provides encapsulation for query terms found in the visible text of the cached result returned
Attributes: (none)

CACHE_LEGEND_NOTFOUND
Format: Text (Custom HTML output only)
Sub-Tags: BLOB?
Definition: Details of any query terms not visible in the cached result returned
Note: The BLOB sub-tag applies to XML output only.
Attributes: (none)

CACHE_LEGEND_TEXT
Format: Text (Custom HTML output only)
Sub-Tags: BLOB (XML output only)
Definition: Details of a query term which is visible in the cached result. Any query terms found in the cached result will automatically be highlighted using the colors described in the attributes of this tag.
Attributes:
  fgcolor
    Format: Color attribute
    Description: The foreground color of the query term in the cached result. This value can be used directly in a color attribute for HTML tags.
  bgcolor
    Format: Color attribute
    Description: The background color of the query term in the cached result. This value can be used directly in a color attribute for HTML tags.

CACHE_REDIR_URL
Format: Text (Absolute URL)
Sub-Tags: (none)
Definition: Final URL of cached result after all redirects are resolved
Attributes: (none)

CACHE_URL
Format: Text (Absolute URL)
Sub-Tags: (none)
Definition: Initial URL of cached result
Attributes: (none)

CRAWLDATE
Format: Text
Sub-Tags: (none)
Definition: This is an optional element that shows the date that the page was crawled. It is shown only for pages crawled within the past two days.
Attributes: (none)

CT
Format: HTML
Sub-Tags: (none)
Definition: Search comments. Example comment: Sorry, no content found for this URL
Attributes: (none)

CUSTOM
Format: (none)
Sub-Tags: (Any custom XML specified in the search request)
Definition: Provides encapsulation for any custom XML tags specified in the proxycustom input parameter
Attributes: (none)

FI
Format: (none)
Sub-Tags: (none)
Definition: Indicates that document filtering was performed during this search. Note: See the section on Automatic Filtering for more details
Attributes: (none)

FS
Format: (none)
Sub-Tags: (none)
Definition: Additional search result details
Attributes:
  NAME
    Format: Text
    Description: Name of the result descriptor
  VALUE
    Format: Text
    Description: Value of the result descriptor

GSP
Format: (none)
Sub-Tags: (TM, Q, PARAM*, CUSTOM?, Spelling?, Synonyms?, CT?, TT?, GM*, RES?)
| CACHE
Definition: GSP = "Google Search Protocol". Provides an encapsulation for all data returned in the Google XML search results
Attributes:
  VER
    Format: Text
    Description: Indicates version of the search results output. The current output version is "3.2".

GD
Format: Text (HTML)
Sub-Tags: (none)
Definition: Contains the description of a KeyMatch result
Attributes: (none)

GL
Format: Text (URL)
Sub-Tags: (none)
Definition: Contains the URL of a KeyMatch result
Attributes: (none)

GM
Format: (none)
Sub-Tags: GL, GD?
Definition: Provides encapsulation for a single KeyMatch result
Attributes: (none)

HAS
Format: (none)
Sub-Tags: L?, C?
Definition: Provides encapsulation for any special features supported for this search request
Attributes: (none)

HN
Format: Text (URL-escaped web directory)
Sub-Tags: (none)
Definition: Indicates that directory crowding has occurred and that additional results are available from the directory where this search result was found. The value of this tag is ready to be used with the "site:" query term.
Attributes:
  U
    Format: Text
    Description: HTML version of web directory

L
Format: (none)
Sub-Tags: (none)
Definition: Indicates that the "link:" special query term is supported for this search result URL
Attributes: (none)

M
Format: Text (Integer)
Sub-Tags: (none)
Definition: The estimated total number of results for the search. Note: The estimate of the total number of results for a search can be too high or too low. Please review the appendix entitled, Estimated vs. Actual Number of Results.
Attributes: (none)

MT
Format: (none)
Sub-Tags: (none)
Definition: Meta tag name and value pairs pulled from the search result. Note: Only meta tags which are requested in the search request will be returned
Attributes:
  N
    Format: Text
    Description: Name of the meta tag
  V
    Format: Text
    Description: Value of the meta tag

NB
Format: (none)
Sub-Tags: PU?, NU?
Definition: Provides encapsulation for result set navigation information. Note: The NB tag will only be present if either previous or additional results are available
Attributes: (none)

NU
Format: Text (Relative URL)
Sub-Tags: (none)
Definition: Contains relative URL to the next results page. Note: The NU tag will only be present if additional results are available
Attributes: (none)

OneSynonym
Format: HTML
Sub-Tags: (none)
Definition: A synonym suggestion for the submitted query in HTML format.
Attributes:
  Q
    Format: Text
    Description: The URL-escaped version of the synonym suggestion

PARAM
Format: (none)
Sub-Tags: (none)
Definition: The input parameters submitted to the Google search engine to generate these results
Attributes:
  name
    Format: Text
    Description: Input parameter name
  value
    Format: HTML
    Description: HTML formatted version of the input parameter value
  original_value
    Format: Text
    Description: Original URL-escaped version of the input parameter value

PU
Format: Text (Relative URL)
Sub-Tags: (none)
Definition: Contains relative URL to the previous results page. Note: The PU tag will only be present if previous results are available
Attributes: (none)

Q
Format: HTML
Sub-Tags: (none)
Definition: The search query submitted to the Google search engine to generate these results
Attributes: (none)

R
Format: (none)
Sub-Tags: U, T?, RK, FS?, MT*, S?, HAS, HN?
Definition: Provides encapsulation for the details of an individual search result
Attributes:
  N
    Format: Text (Integer)
    Description: Indicates the index (1-based) of this search result
  L
    Format: Text (Integer)
    Description: Indicates the recommended indentation level of the results. Note: Currently this value will always be 1 unless directory crowding occurs. In this case, the second directory result will have a value of 2.
  MIME
    Format: Text
    Description: Indicates the MIME type of the search result

RES
Format: (none)
Sub-Tags: M, FI?, XT?, NB?, R*
Definition: Provides encapsulation for the details of the individual search results
Attributes:
  SN
    Format: Text (Integer)
    Description: Indicates the index (1-based) of the first search result returned in this result set
  EN
    Format: Text (Integer)
    Description: Indicates the index (1-based) of the last search result returned in this result set

RK
Format: Text (Integer in the range 0-10)
Sub-Tags: (none)
Definition: Provides a general rating of the relevance of the search result
Attributes: (none)

S
Format: Text (HTML)
Sub-Tags: (none)
Definition: Search result snippet for the search result. Note: Query terms will be highlighted in bold in the results, and line breaks will be included for proper text wrapping.
Attributes: (none)

Spelling
Format: (none)
Sub-Tags: Suggestion+
Definition: Provides encapsulation for alternate spelling suggestions for the submitted query. Only one spelling suggestion is returned at this time.
Attributes: (none)

Suggestion
Format: HTML
Sub-Tags: (none)
Definition: An alternate spelling suggestion for the submitted query in HTML format
Attributes:
  Q
    Format: Text
    Description: The URL-escaped version of the spelling suggestion

Synonyms
Format: (none)
Sub-Tags: OneSynonym+
Definition: Provides encapsulation for synonym suggestions for the submitted query. Up to 20 synonym suggestions may be returned depending on the synonym list associated with the front end by the search administrator.
Attributes: (none)

T
Format: Text (HTML)
Sub-Tags: (none)
Definition: The title of the search result
Attributes: (none)

TM
Format: Text (Floating-point number)
Sub-Tags: (none)
Definition: Total server time to return search results, measured in seconds.
Attributes: (none)

U
Format: Text (Absolute URL)
Sub-Tags: (none)
Definition: The URL of the search result.
Attributes: (none)

XT
Format: (none)
Sub-Tags: (none)
Definition: Indicates that the estimated total number of results specified in this search result is exact. Note: See the section on Automatic Filtering for more details.
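
To illustrate how the tags defined above fit together, the following Python sketch parses a small XML result set with the standard library. The sample XML, its values, and the function name parse_results are hypothetical and serve only as an example; real output will contain additional tags. The sketch reads only the tags it needs, so undocumented or future tags (such as the made-up FUTURE_TAG below) are ignored rather than causing an error.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; values are invented for illustration.
SAMPLE_XML = """<GSP VER="3.2">
  <TM>0.05</TM>
  <Q>bill material</Q>
  <RES SN="1" EN="2">
    <M>25</M>
    <FI/>
    <NB><NU>/search?q=bill+material&amp;start=2&amp;num=2</NU></NB>
    <R N="1"><U>http://www.example.com/bom.html</U><T>Bill of &lt;b&gt;Material&lt;/b&gt;</T><RK>7</RK><FUTURE_TAG>ignored</FUTURE_TAG></R>
    <R N="2"><U>http://www.example.com/parts.html</U><RK>5</RK></R>
  </RES>
</GSP>"""


def parse_results(xml_text):
    """Extract a few documented fields from a GSP result document."""
    root = ET.fromstring(xml_text)
    res = root.find("RES")
    if res is None:  # RES is optional in GSP
        return {"results": [], "estimated_total": None,
                "filtered": False, "next_url": None}
    results = []
    for r in res.findall("R"):
        results.append({
            "index": int(r.get("N")),
            "url": r.findtext("U"),
            "title": r.findtext("T"),  # T is optional, so this may be None
            "rank": int(r.findtext("RK")),
        })
    return {
        "results": results,
        "estimated_total": int(res.findtext("M")),
        "filtered": res.find("FI") is not None,
        "next_url": res.findtext("NB/NU"),  # None when no further results
    }


if __name__ == "__main__":
    for item in parse_results(SAMPLE_XML)["results"]:
        print(item["index"], item["url"])
```
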

Appendices

This section contains any appendices relevant to Google search:

Estimated vs. Actual Number of Results
URL Escaping

Appendix A: Estimated vs. Actual Number of Results

The Google search engine does not guarantee the ability to return a particular number of results for any given search query. The total number of results provided by Google in the search results is an estimate of the actual number of results for the query. This number can be higher or lower than the actual number of results available. This section covers any issues relating to this topic.

Behavior

When a search request is made to Google, the following behavior occurs:

1. If Google has results to satisfy the search request, then the requested number of results will be returned.
2. If Google has results and the search request is for results beyond what is available, the last page of results will be returned. The last page of results is determined by dividing the total number of results into pages based on the number of results requested.
3. If no results are available for the search request, then an empty result set will be returned.

In order to determine if a particular results page is the last page of available results, check for any of the following conditions:

1. The first result number returned does not match the first result number requested.
2. The number of results returned is less than the number of results requested.
3. The results returned do not contain a link to the next result set.

Automatic Filtering

Typically, the number of results actually returned is significantly reduced by the automatic filtering that Google performs on all search results to weed out undesirable results. This feature can be disabled per the instructions in the Automatic Filtering section. Any results which have been filtered will be identified in the results returned.
For example, the <FI> XML tag will be present in any XML search results where automatic document filtering has occurred. Google recommends that the search results page display a message on the last page of the search results similar to the following message when automatic filtering occurs:

In order to show you the most relevant results, we have omitted some entries very similar to the search results already displayed. If you like, you can repeat the search with the omitted results included.

The final phrase of the message, "repeat the search with the omitted results included" (underlined in the original layout), should be a hypertext link to submit the same search again with the filter parameter set to the value 0. Google has found that this method of informing users about automatic document filtering works well and is used on the Google Internet search site.

Navigation

When the total number of results returned is an estimate, the navigation structure for search results can be complicated. Google recommends two approaches for generating a navigation scheme for your search results:

1. Only provide the search user with the ability to navigate to the previous results page and the next results page. Google provides links to the previous and next result set in the results returned when appropriate.
2. Provide the search user with the ability to jump to any search page in the estimated number of results. If the user requests a results page beyond those for which results are actually available, the last results page will be returned and the navigation structure should be updated at that time. Google uses this approach on our Internet search site.

Appendix B: URL Escaping

In order to make a search request to the Google search engine through an HTTP URL request, there are certain conventions that must be followed in order to allow the search engine to correctly translate your search request.
The HTTP URL scheme specifies that only alphanumeric characters, the special characters $_.+!*'(), and the reserved characters ;/?:@=& can be used as values within an HTTP URL request. Since reserved characters are used by the search engine to decode the URL and some special characters are used to request search features, all non-alphanumeric characters used as input parameter values should be URL escaped.

In order to URL escape a string, all space characters should be converted to a "+" character and all other non-alphanumeric characters should be replaced by a "%" character followed by two hexadecimal digits representing the value of that character.

Some input parameters require that the values passed to Google search be double URL escaped. This means that you will need to apply the URL escaping to the string twice in succession to generate the final value. See the input parameter descriptions for more information.

Note: Additional information on URL escaping can be found at W3C and IETF web sites.

Examples

Original String                        URL Escaped String
chicken -teriyaki                      chicken+%2Dteriyaki
admission form site:www.stanford.edu   admission+form+site%3Awww.stanford.edu

Original String                        Doubly URL Escaped String
William Shakespeare                    William%2BShakespeare
admission form site:www.stanford.edu   admission%2Bform%2Bsite%253Awww.stanford.edu

Glossary

This glossary contains basic descriptions of acronyms and terms found in this document which may be new to some readers.

Cached result - As part of its core technology, Google indexes all the content on a page, rather than a portion of the content (percentage or meta tags). Each page that is indexed is also available to be served in a cached HTML format (up to 4 million bytes of each document before HTML conversion). When a user views a cached document, each query term is highlighted in a different color, making it easy for the user to find the information sought.
Because all pages are cached, the user always has access to content that has been indexed, even if the server where the live content is stored happens to be refusing connections or is slow to return the page.

Collection - A collection is a subset or a view of the document index. Collections are specified by URL patterns; some collections are created automatically by the Google search engine. Collections are useful for allowing refined or advanced searches, for limiting access to classified information, for group-level security, for language-specific queries and for many other applications.

DTD - Document Type Definition. The purpose of a DTD is to define the legal building blocks of an XML document. It defines the XML document structure with a list of legal elements.

Encoding Scheme - Each language has an official encoding scheme which is used to represent all of the language's characters in an 8-bit data stream format. These encoding schemes are used by Google search to determine how to translate incoming and outgoing search requests.

KeyMatch - Because you occasionally may want to return special results for specific queries, Google search may be configured with the KeyMatch feature. Using KeyMatch, the search administrator can designate special results that are returned in addition to the standard results when specific queries are made. Google recommends using KeyMatch carefully, as it can drastically decrease the quality of results if overused.

Meta Tags - HTML tags which can be specified within an HTML document which are not displayed to the end user, but which may contain document meta-data. Google search uses meta tags with the NAME attribute to enhance and filter search results when requested.

MIME - Multipurpose Internet Mail Extensions. The MIME type of a web document (or search result) identifies the format of the document it is associated with.
Some sample MIME types include "text/html" for HTML documents, and "application/msword" for Microsoft Word documents.

Query - A string of query terms separated by the space character which is submitted to Google search. The results returned for a particular query will satisfy all query terms by default.

Query term - A single term which defines a unit of search for the Google search engine to find in the index. A single query term cannot contain any spaces or punctuation.

UTF-8 - Unicode Transformation Format (8-bit). UTF-8 is a Unicode-based encoding scheme for describing language data by representing the data using 8-bit codes. This encoding scheme is used by Google search to support multiple languages simultaneously.

Web Directory - A subset of files on a web server stored under its own directory name.

XML - eXtensible Markup Language. XML is a markup language, similar to HTML, which was designed to describe data. The tags used in XML are not pre-defined, and are described by a DTD or the data provider.

XSL - eXtensible Stylesheet Language. XSL is a language that is designed to describe how an XML document should be displayed. XSL contains commands that can be used to describe the transformation and formatting of an XML document for display. XSL is used in the Google search environment to transform XML results into custom HTML output.

XSLT - XSL Transformation. XSLT describes the process of transforming an XML document into another format. Google search allows search administrators to use our XSLT server to transform our standard XML results into their own custom HTML output.