IEEE Xplore® XML Search API
User Documentation
Revision 4.2
January, 2011

IEEE
Piscataway, NJ 08854

IEEE Xplore XML Search API Page 1 of 13

Table of Contents

Introduction
XML Gateway Query Parameters
    XML Gateway Query Examples
        Author Search
        Title Search
        Abstract Search
        Publication Title Search
        DOI Search
        ISBN/ISSN Search
        Publication Year Search
        Use of pys and pye parameters to restrict results by year range.
        Use of hc & rs parameters for pagination
        Complex queries on all or selected metadata fields, abstract and document text
    Restrictions on queries
Tips for migrating from XML Gateway version 3.x
XML RESPONSE
    Search Query Results
    Single Article Query Results
    Description of the XML nodes returned.

Introduction

As part of IEEE's continuing efforts to improve ease of access to the IEEE Xplore Digital Library, IEEE makes available an application programming interface (API) called the IEEE XML Gateway. The XML Gateway allows IEEE customers and 3rd parties such as federated search vendors to query the IEEE Xplore content repository and retrieve results in XML format for manipulation and presentation on local web interfaces.

The XML Gateway URL is:

http://ieeexplore.ieee.org/gateway

This is a help page that displays the available query parameters as well as some sample queries.

Queries sent to the gateway should be of the form:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?key1=value1[&key2=value2&....]

At least one key-value pair (kvp) specifying the search terms must be present in the query URL. In addition, kvps may be added to specify the number of records to return and the record at which to start (to enable paging of the results), and to sort the results in ascending or descending order on specific fields.
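As a minimal sketch of the query form described above, the following Python snippet assembles a gateway URL from key-value pairs. The endpoint path is the one used in the query examples of this document; the function name build_query_url is hypothetical, and this is not an official IEEE client.

```python
from urllib.parse import urlencode

# Endpoint path as used in the query examples of this document.
GATEWAY_SEARCH_URL = "http://ieeexplore.ieee.org/gateway/ipsSearch.jsp"


def build_query_url(**params):
    """Return a gateway query URL for the given key-value pairs.

    Hypothetical helper for illustration only.
    """
    if not params:
        # The gateway expects at least one kvp carrying search terms.
        raise ValueError("at least one key-value pair is required")
    # urlencode percent-encodes characters that are not URL-safe.
    return GATEWAY_SEARCH_URL + "?" + urlencode(params)


# First 10 records with "smith" in the author name.
print(build_query_url(au="smith", hc=10, rs=1))
```

Building the query string with urlencode rather than by hand keeps spaces, quotes and parentheses in search terms from breaking the URL.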
XML Gateway Query Parameters

The following table lists the supported parameter keys accepted by the XML Gateway.

Parameter    Values    Boolean query field name     Definition

Parameter specifying single article
an           String    "Article Number"             Article Number

Parameters specifying the search terms
au           String    Author                       Author Name
ti           String    "Document Title"             Article Title
ab           String    Abstract                     Abstract
doi          String    DOI                          DOI
cs           String    "Author Affiliation"         Affiliations
jn           String    "Publication Title"          Publication Title
isbn         String    ISBN                         ISBN
issn         String    ISSN                         ISSN
py           Integer   "Publication Year"           Publication Year
thsrsterms   String    "Thesaurus Terms"            IEEE Thesaurus Terms
cntrlterms   String    "INSPEC Controlled Terms"    INSPEC Controlled Terms
idxterms     String    NA                           Single term & complex queries on available index term fields.
md           String    NA                           Single term & complex queries on available metadata fields and abstract.
querytext    String    NA                           Single term & complex queries on available metadata fields, abstract and document text.

Parameters for filtering results
pys          Integer   NA                           Start (Publication) Year
pye          Integer   NA                           End (Publication) Year
pu           String    NA                           Publisher. One of IEEE/AIP/IET/AVS/IBM
ctype        String    NA                           Content Type. One of Conferences/Journals/Books/Early Access/Standards/Educational Courses

Parameters for sorting results
sortfield    String    NA                           One of au/ti/doi/cs/jn/py
sortorder    String    NA                           asc (for ascending) or desc (for descending)

Parameters for paging
hc           Number    NA                           No. of records to return. Default 25; Max. 200.
rs           Number    NA                           Index of record at which to start. Default 1.

Note: The parameters are not case-sensitive.
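The constraints in the table can be checked on the client before a query is sent. The sketch below shows one possible way to do that in Python. The helper normalize_params is hypothetical; lower-casing keys assumes the case-insensitivity note refers to parameter keys, and capping hc at 200 simply mirrors the documented maximum.

```python
# Keys and allowed values copied from the parameter table above.
SEARCH_KEYS = {"an", "au", "ti", "ab", "doi", "cs", "jn", "isbn", "issn", "py",
               "thsrsterms", "cntrlterms", "idxterms", "md", "querytext"}
FILTER_KEYS = {"pys", "pye", "pu", "ctype"}
SORT_KEYS = {"sortfield", "sortorder"}
PAGING_KEYS = {"hc", "rs"}
KNOWN_KEYS = SEARCH_KEYS | FILTER_KEYS | SORT_KEYS | PAGING_KEYS

SORT_FIELDS = {"au", "ti", "doi", "cs", "jn", "py"}
SORT_ORDERS = {"asc", "desc"}
MAX_HC = 200


def normalize_params(params):
    """Return a cleaned copy of params (hypothetical client-side check)."""
    clean = {}
    for key, value in params.items():
        key = key.lower()  # assumption: keys are the case-insensitive part
        if key not in KNOWN_KEYS:
            raise ValueError(f"unknown parameter: {key}")
        clean[key] = value
    if "sortfield" in clean and clean["sortfield"] not in SORT_FIELDS:
        raise ValueError("sortfield must be one of au/ti/doi/cs/jn/py")
    if "sortorder" in clean and clean["sortorder"] not in SORT_ORDERS:
        raise ValueError("sortorder must be asc or desc")
    if "hc" in clean:
        # Cap the page size at the documented maximum of 200.
        clean["hc"] = min(int(clean["hc"]), MAX_HC)
    return clean


print(normalize_params({"AB": "digital", "sortfield": "ti",
                        "sortorder": "desc", "hc": 500}))
```

Rejecting unknown keys early is a design choice of this sketch; the document does not say how the gateway itself treats unrecognized parameters.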
XML Gateway Query Examples

Author Search

Return the first 10 records with smith in the author name:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?au=smith&hc=10&rs=1

Title Search

Return records 11 to 25 with java in the Document Title:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?ti=java&hc=15&rs=11

Abstract Search

Return the first 20 articles with digital in the Abstract, sorted in descending order on Document Title:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?ab=digital&sortfield=ti&sortorder=desc&hc=20&rs=1

Publication Title Search

Return the first 20 articles with solid-state in the Publication Title:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?jn=solid-state&hc=20&rs=1

DOI Search

Return articles with a matching DOI. A DOI (Digital Object Identifier) is unique, so the result set will most likely contain a single article.

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?doi=10.1109/TAP.1961.1144990&hc=20&rs=1

ISBN/ISSN Search

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?issn=1089-7801

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?isbn=0-7695-1089-2

Publication Year Search

Return articles published in the specified year:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?py=2005

Use of pys and pye parameters to restrict results by year range.

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?pys=1995&pye=1998

Use of hc & rs parameters for pagination

These parameters limit the number of records returned by a single query while still allowing all relevant results to be fetched by multiple queries if required. The parameter hc specifies the number of results to be returned by a query, while the parameter rs specifies the index of the record at which to start.

The following search returns results 1 to 20:
http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?ab=digital&hc=20&rs=1

The following search returns results 21 to 40, to display the next page:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?ab=digital&hc=20&rs=21

rs defaults to 1 and hc defaults to 25. The maximum allowed value for hc is 200. If a higher value is passed, it will be treated as 200.

Complex queries on all or selected metadata fields, abstract and document text

The parameters md and querytext may be used to search all searchable metadata without specifying individual fields. They may also be used to formulate complex queries that apply the AND, OR and NEAR operators to different fields, with parentheses to structure the query. A search using md covers all metadata and the abstract, while a search using querytext also covers the document text. Some example queries are shown below.

Search for rfid in all metadata, abstract and document text:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?querytext=rfid

Multi term Phrase Query:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?querytext="Antennas and propagation"

Note that the quotes are required to indicate that this is a phrase search. Without quotes, the query would return results containing antennas and propagation in any order and in any combination of fields. Note also that without quotes, and is not considered part of the query, since it is one of a number of frequently occurring but generally irrelevant words that are configured as stop words.

Multi term Boolean Query:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?querytext=((java AND Xml) OR ood)

Multi Fielded Complex Query using Verity Query Language:

http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?querytext=(Abstract:java OR "Document Title":Xml)

Note: The full field name (as shown in the third column of the table above) should be specified in Boolean queries. If a field name contains multiple words, it should be enclosed in quotes, e.g. "Document Title".

Restrictions on queries

1. The query should contain values for at least one of the following parameters: an/querytext/md/idxterms/ab/ti/au/cs/jn/doi/py/pys/pye
2. Wildcards (*) may be used for query expansion; however, a query term may contain a maximum of two wildcards. E.g. ti=comp* program* is valid, but ti=comp* program* dev* is not.
3. Wildcarded words should contain at least 3 valid characters. E.g. com* is valid, but co* is not.
4. A maximum of 200 results may be retrieved in a single query. If the parameter hc is set to a value greater than 200, it will be reset to 200.
5. A query term can contain a maximum of 10 words. If a term has more than 10 words, the query will be truncated at the 10th word.

Tips for migrating from XML Gateway version 3.x

The update to the XML Gateway was necessitated when Endeca replaced Verity as the search engine powering the IEEE Xplore website. To minimize inconvenience to existing users of the Gateway, the query syntax and response format have been preserved as far as possible. However, differences in the behavior of the two search engines mean that some user queries may have to be modified. Some important changes are listed below:

1. If multiple search fields are used in a query, their results are ANDed, i.e. ti=java&ab=xml returns results matching java in the title AND xml in the abstract.
2. The search terms are automatically expanded using a stemming dictionary and any thesaurus entries configured by IEEE staff. So searches for run, runs, ran and running will all return the same results.
3. Multi-word searches are not treated as phrases. So a search for java xml in the title will return records containing java and xml anywhere in the title, not necessarily as a phrase. Phrase matches are considered more relevant and will be shown first if no sort order is specified.
4. If no records match all the words in a multi-word search, records containing partial matches will be returned.
   However, the records should match at least two words from the query. So a two-word query will never return records matching only one of the words, but a three-word query may return records matching any two of the words if there are no records matching all three words. In addition, a partial match may miss at most two of the words in the search query, i.e. while a four-word query may return records matching two of the words, a five-word query must match at least three of the words.
5. When multi-word searches are used for querytext, cross-field matches are also valid. So for querytext=java xml, records containing java in the title and xml in the abstract are valid, though they are considered less relevant than records containing both words in either the title or the abstract.
6. To specify a phrase search, the search terms should be enclosed in double quotes.
7. Non-alphanumeric characters (except for &, + and /) are replaced by spaces in the queries. So a search ti=two-way will return records containing two-way or two way in the title. Note that words separated by special characters are treated as phrases, so records containing way two or two roadblocks along the way will not be returned.
8. Certain commonly occurring but irrelevant words, such as a, and, you, etc., are configured as stop words and will not contribute to the query unless they are enclosed in quotes. So the search java and is equivalent to java, but not to "java and".
9. If no sorting is specified and the search is on all metadata, the records are returned in order of relevance. Briefly, relevance is determined by:
   - Number of terms matched in a multi-word query,
   - Rank of the field in which the match was found (e.g. a match in Document Title is more relevant than a match in Abstract),
   - Whether multi-word terms were matched as a phrase,
   - Number of matches found in all the fields.
10. Boolean searches use a different syntax.

    Old version                  New version
    java <in> title              "Document Title":java
    java <or> xml                java OR xml
    java <and> xml               java AND xml
    java <not> xml               java NOT xml
    (java <not> xml) <in> ti     "Document Title":java NOT "Document Title":xml

XML RESPONSE

Note that a search may at times return content containing characters that cause problems with the HTTP protocol. As a workaround, the Gateway wraps all node values in CDATA sections.

Search Query Results

The format of the results returned for a search is shown below. A number is represented by 9999 and a string is represented by XXXX.

<root>
  <totalfound>9999</totalfound>
  <totalsearched>9999</totalsearched>
  <document>
    <rank>9999</rank>
    <title>XXXX</title>
    <authors>XXXX; XXXX; XXXX;</authors>
    <thesaurusterms>
      <term>XXXX</term>
      <term>XXXX</term>
    </thesaurusterms>
    <controlledterms>
      <term>XXXX</term>
      <term>XXXX</term>
    </controlledterms>
    <pubtitle>XXXX</pubtitle>
    <punumber>9999</punumber>
    <pubtype>Conference or Journal or Standard</pubtype>
    <volume>XXXX</volume>
    <issue>XXXX</issue>
    <part>XXXX</part>
    <py>XXXX</py>
    <spage>XXXX</spage>
    <epage>XXXX</epage>
    <abstract>XXXX</abstract>
    <issn>XXXX</issn>
    <isbn>XXXX</isbn>
    <pdf>XXXX</pdf>
    <arnumber>9999</arnumber>
  </document>
</root>

Single Article Query Results

A query for a single article using the an parameter will return results in the format shown below.
<root>
  <document>
    <title>XXXX</title>
    <authors>
      <name>XXXX</name>
      <order>9999</order>
    </authors>
    <thesaurusterms>
      <term>XXXX</term>
      <term>XXXX</term>
    </thesaurusterms>
    <controlledterms>
      <term>XXXX</term>
    </controlledterms>
    <uncontrolledterms>
      <term>XXXX</term>
    </uncontrolledterms>
    <pubtitle>XXXX</pubtitle>
    <punumber>9999</punumber>
    <pubtype>Conference or Journal or Standard</pubtype>
    <py>XXXX</py>
    <spage>XXXX</spage>
    <epage>XXXX</epage>
    <abstract>XXXX</abstract>
    <issn>XXXX</issn>
    <isbn>XXXX</isbn>
    <volume>XXXX</volume>
    <issue>XXXX</issue>
    <part>XXXX</part>
    <arnumber>9999</arnumber>
    <doi>XXXX</doi>
    <mdurl>http://XXXX</mdurl>
    <pdf>http://XXXX</pdf>
  </document>
</root>

Description of the XML nodes returned.

TAG                  DESCRIPTION
rank                 Results are ordered by relevance. Rank 1 is the most relevant, rank 2 the second most relevant and so on.
title                Article Title
authors              Semicolon-delimited list of author names. (Search Results)
name                 Author Name
order                Position of the author's name in the list of authors.
pubtitle             Publication that the article appears in.
punumber             IEEE identifier for the publication
pubtype              The IEEE has 3 types of publications: Journals, Conferences and Standards.
volume               Volume
issue                Issue
part                 Part
py                   Publication year
spage                Start page
epage                End page
abstract             First 250 words of the abstract
issn                 ISSN
isbn                 ISBN
pdf                  URL to the full text of the document
arnumber             Unique article number for the document
thesaurusterms       Terms from the IEEE thesaurus
controlledterms      Terms from an INSPEC controlled thesaurus
uncontrolledterms    Terms not from the INSPEC thesaurus
term                 Term

Note that some documents or results may not have all the information specified above. In that case, the corresponding tags will not appear in the resulting XML.
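The search result format described in this document can be read with any XML parser. The following Python sketch uses the standard library's xml.etree.ElementTree, which returns the content of CDATA sections as ordinary text. The sample response is invented placeholder data shaped like the search query results format, and parse_search_results is a hypothetical helper that handles the search results layout only, not the single-article layout.

```python
import xml.etree.ElementTree as ET

# Invented sample shaped like the search query results format; not real data.
SAMPLE_RESPONSE = """<root>
  <totalfound>1</totalfound>
  <totalsearched>9999</totalsearched>
  <document>
    <rank>1</rank>
    <title><![CDATA[Example title]]></title>
    <authors><![CDATA[Author A; Author B;]]></authors>
    <thesaurusterms>
      <term><![CDATA[term one]]></term>
      <term><![CDATA[term two]]></term>
    </thesaurusterms>
    <py>2005</py>
    <arnumber>9999</arnumber>
  </document>
</root>"""

TERM_LIST_TAGS = ("thesaurusterms", "controlledterms", "uncontrolledterms")


def parse_search_results(xml_text):
    """Return (totalfound, list of record dicts) for a search response."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("totalfound", default="0"))
    records = []
    for doc in root.findall("document"):
        # Leaf nodes become plain strings; tags may be absent entirely.
        record = {child.tag: (child.text or "").strip()
                  for child in doc if len(child) == 0}
        # Term containers become lists of their <term> values.
        for tag in TERM_LIST_TAGS:
            node = doc.find(tag)
            if node is not None:
                record[tag] = [(t.text or "").strip() for t in node.findall("term")]
        # In search results, authors is a semicolon-delimited string.
        if "authors" in record:
            record["authors"] = [a.strip() for a in record["authors"].split(";")
                                 if a.strip()]
        records.append(record)
    return total, records


total, records = parse_search_results(SAMPLE_RESPONSE)
print(total, records[0]["title"], records[0]["authors"])
```

Because some tags may be missing from a given record, callers would typically read fields with record.get(...) rather than indexing directly.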