WWW’09 Workshop Proposal:
Semantic Search
Marko Grobelnik, Jožef Stefan Institute, Ljubljana, Slovenia
Peter Mika, Yahoo! Research, Barcelona, Spain
Thanh Tran Duc, Institute AIFB, University of Karlsruhe (TH), Germany
Haofen Wang, Apex Data & Knowledge Management Lab, Shanghai Jiao Tong University, China
Executive Summary
Semantic technologies, namely expressive ontology and resource description languages, scalable
repositories, reasoning engines and information extraction techniques, are now in a mature state
such that they can be applied to enable a higher level of semantic underpinning in real-world
Information Retrieval (IR) systems. This application of semantic technologies to IR tasks is typically
referred to as Semantic Search. Challenges along the way include (i) identifying tasks and paradigms for
semantic search systems, (ii) devising expressive annotation frameworks as well as scalable
algorithms and infrastructures, (iii) investigating innovative query paradigms for semantic search
systems, and (iv) applying machine learning and information extraction techniques in the context of
semantic search.
Topic and Scope
In recent years we have witnessed tremendous interest and substantial economic exploitation of
search technologies, both at web and enterprise scale. However, the representation of user queries
and resource content in existing search appliances is still almost exclusively achieved by simple
syntax-based descriptions of the resource content and the information need such as in the
predominant keyword-centric paradigm (i.e. keyword queries matched against bag-of-words
document representation). While these systems have been shown to work well for topical search, i.e.
retrieving documents based on a topic, they work on the basis of rough approximations and usually fail
to address more complex information needs.
On the other hand, recent advances in the field of semantic technologies have resulted in tools and
standards that allow for the articulation of domain knowledge in a formal manner at a high level of
expressivity. At the same time, semantic repositories and reasoning engines have only now advanced
to a state where querying and processing of this knowledge can scale to realistic IR scenarios. As
such, semantic technologies are now in a state to provide significant contributions to IR problems.
More expressive descriptions of resources can be achieved through the conceptual representation of
the actual resource content and the collaborative annotation of general resource metadata using
standard Semantic Web languages. As a result, there is high potential that complex information
needs can be supported by the application of semantic web technologies to IR, where expressive
queries can be matched against expressive resource descriptions.
In parallel to these developments, in the past years we have also seen the emergence of important
results in adapting ideas from IR to the problem of search in RDF/OWL data, folksonomies,
microformat collections or semantically tagged natural text. Common to these scenarios is that the
search is focused not on a document collection, but on metadata (which may be linked to or
embedded in textual information). Search and ranking in metadata stores is another key topic
addressed by this workshop.
The immediate relevance of the topic for the Semantic / Data Web track as addressed at the
WWW’09 conference arises from two aspects. On the one hand, search technology is a dominant
technology for direct interaction with end-users, both in web and enterprise settings. However,
research efforts in the Semantic Web community in recent years have largely targeted other fields.
On the other hand, the success of search engines like Google™, which do not explicitly utilize
semantic technologies, challenges the predominant public notion of the Semantic Web as a web that
“will yield better search results”. WWW is the best place for this workshop as it covers
interdisciplinary topics between Semantic Web and search. Recent trends, such as a significant
number of publications at ISWC+ASWC’07 and ESWC’08 that would fit into the workshop scope,
support the need for a forum that explicitly targets Semantic Search. In particular, our previous
workshop (i.e. the first workshop on “semantic search”) was among the biggest ones at ESWC’08 and
attracted the highest number of submissions.
Challenges
In this context, several challenges arise for Semantic Search systems. These include, among others:
• How can semantic technologies be exploited to capture the information need of the user?
• How can the information need of the user be translated to expressive formal queries without
  requiring the user to handle difficult query syntax?
• How can expressive resource descriptions be extracted (acquired) from documents (users)?
• How can expressive resource descriptions be stored and queried efficiently on a large scale?
• How can vague information needs and incomplete resource descriptions be handled?
• How can semantic search systems be evaluated and compared with standard IR systems?
Topics
Main topics of interest for the envisioned workshop contributions include (but are not limited to) the
following areas:
Tasks and interaction paradigms for semantic search
• Information retrieval tasks on the semantic web
• Incentives and interaction paradigms for resource annotation
• Interaction paradigms for semantic search
• Collaborative aspects of semantic search (wikis, social networks)
Query construction and resource modelling for semantic search
• Semantic technologies for query interpretation, refinement and routing
• Natural language interfaces for semantic web repositories
• Modelling expressive resource descriptions
• Ontology and metadata standards for expressive resource descriptions
• Natural language processing and information extraction for the acquisition of resource
  descriptions
• Semantic web mining and semantic network analysis
Algorithms and infrastructures for semantic search
• Scalable reasoners, repositories and infrastructures for semantic search
• Crawling, storing and indexing of expressive resource descriptions
• Fusion of semantic search results on the semantic web
• Algorithms for matching expressive queries and resource descriptions
• Algorithms and reasoning procedures to deal with vagueness, incompleteness and
  inconsistencies in semantic search
Evaluation of semantic search
• Evaluation methodologies for semantic search
• Standard datasets and benchmarks for semantic search
Community and related activities
Intended Audience
The workshop is of interest for researchers in Semantic Web, Information Retrieval, Information
Extraction and User Interaction with research interests at the intersection of these fields. The
following research projects address or partially address these fields, and the respective project
coordinators have indicated they wish to act as sponsors, to advertise the workshop amongst their
members and promote attendance.
• X-Media - Knowledge Sharing and Reuse across Media (EU IST IP)
• NEON - Lifecycle Support for Networked Ontologies (EU IST IP)
• PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning (EU IST NOE)
• ACTIVE - Enabling the Knowledge Powered Enterprise (EU ICT IP)
• THESEUS (German Federal Ministry of Economy and Technology Research Program)
The EU IP projects LarKC, OKKAM, and WeKnowIt as well as the EU STREP projects SMARTMUSEUM,
KIWI, and JUMAS will provide additional support in dissemination.
Recent related Events
The following recent events have addressed issues related to the topic of Semantic Search:
• Workshop on Semantic Search, ESWC 2008, Tenerife, Spain
• Workshop on Web Search Technology, ASWC 2006, Beijing, China
• Workshop on Learning in Web Search, ICML 2005, Bonn, Germany
• Workshop on Learning and Extending Lexical Ontologies by using Machine Learning Methods,
  ICML 2005, Bonn, Germany
• Workshop on Knowledge Discovery and Ontologies at ECML 2005, Porto, Portugal
• 2nd European Web Mining Forum at ECML/PKDD 2005, Porto, Portugal
• Workshop on Mining for and from the Semantic Web at KDD 2004, Seattle
• Workshop on Semantic Network Analysis at ISWC 2005, Galway
Organization
The workshop will preferably be held as a full-day workshop and will feature two invited talks of
one hour each, one in the morning and one in the afternoon. As a networking opportunity, the workshop
will also devote one hour to presentations of related research projects, their view on Semantic
Search and possible synergies. Submissions will be thoroughly reviewed by at least three reviewers,
two of whom should represent the two main perspectives, i.e. Semantic Technologies and Information
Retrieval. We will publicize the workshop via mailing lists (the respective W3C, Semantic Web,
Information Retrieval lists and available related project lists) and to addresses of participants of
previously held related workshops. Additionally we will provide links from the homepages of our
institutes and of related projects. Important dates will be aligned with the overall WWW
organization.
Program Committee
We target a balanced program committee that includes experienced researchers with a strong
research record in the relevant research areas of Semantic Technologies and Information Retrieval
(including Information Extraction, Text Mining and Multimedia Retrieval). Specifically, the following
people have indicated their willingness to review submissions and to disseminate the workshop:
• Bettina Berendt, University of Leuven, Belgium
• Paul Buitelaar, DFKI Saarbrücken, Germany
• Wray Buntine, NICTA Canberra, Australia
• Pablo Castells, Universidad Autónoma de Madrid, Spain (to be confirmed)
• Philipp Cimiano, Institute AIFB, University of Karlsruhe, Germany
• Fabio Ciravegna, University of Sheffield, UK
• Blaz Fortuna, Jožef Stefan Institute, Slovenia
• Lise Getoor, University of Maryland, USA
• Rayid Ghani, Accenture Labs, USA
• Peter Haase, Institute AIFB, University of Karlsruhe, Germany
• Andreas Hotho, University of Kassel, Germany
• Esther Kaufmann, University of Zurich, Switzerland
• Yiannis Kompatsiaris, Informatics and Telematics Institute, Greece
• Eduarda Mendes Rodrigues, Microsoft Research, Cambridge, UK
• Steffen Staab, University of Koblenz-Landau, Germany
• Nenad Stojanovic, FZI Karlsruhe, Germany
• Rudi Studer, Institute AIFB, University of Karlsruhe, Germany
• Raphael Volz, FZI Karlsruhe, Germany
• Michael Witbrock, Cycorp, USA
• Ilya Zaihrayeu, University of Trento, Italy
• Hugo Zaragoza, Yahoo! Research Barcelona, Spain
• Yong Yu, Shanghai Jiao Tong University, China
Background on workshop organizers
In the following, we provide background information about the workshop organizers.
Marko Grobelnik
Jožef Stefan Institute, Department for Intelligent Systems
Jamova 39, SLO-1000 Ljubljana, Slovenia
Office Phone: +386.61.1773-778; eMail: Marko.Grobelnik@ijs.si
http://www-ai.ijs.si/MarkoGrobelnik/
Marko is a researcher and the manager of a 15-person research group at the department, working
primarily in the areas of text mining and social network analysis. He is co-author of several books
and numerous scientific papers. Marko is technical director of the FP6 IST World project on the
analysis of European research, a member of the management board of several FP6 projects, and
participates in W3C standardization committees. He has co-organized over 10 international
workshops and tutorials on text mining and link analysis at prominent conferences such as IJCAI,
ACM-KDD and IEEE-ICDM. Marko also collaborates closely on research projects with Microsoft Research,
Cycorp Europe, Carnegie Mellon University and Cornell University.
Peter Mika
Yahoo! Research, Barcelona Lab
Ocata 1, 1st floor, E-08003, Barcelona, Spain
Office Phone: +34.935.421-165; eMail: pmika@yahoo-inc.com
http://research.yahoo.com/Peter_Mika
Peter is a researcher at Yahoo! Research, Barcelona. He obtained his PhD in 2007 from the Business
Informatics group of the Faculty of Sciences (FEW) at the Vrije Universiteit, Amsterdam. His research
focus is on Search Technologies, Semantic Technologies and Social Networks. His interdisciplinary
work in the field of Social Networks and the Semantic Web earned a Best Paper Award at the
International Semantic Web Conference in Galway, 2005 and a First Prize at the Semantic Web
Challenge of 2005. He is also the author of the book “Social Networks and the Semantic Web”, published
in 2007 by Springer Verlag. He has been involved in several large European Semantic Web projects
such as On-To-Knowledge, SWAP (Semantic Web and Peer-to-Peer) and WonderWeb.
Thanh Tran Duc
University of Karlsruhe, Institute AIFB, Knowledge Management Research Group
D-76128 Karlsruhe, Germany
Office Phone: +49.721.608-7363; eMail: dtr@aifb.uni-karlsruhe.de
Thanh is a research associate and PhD student at the Institute AIFB, University of Karlsruhe (TH). He
has received two degrees, a Master of Commerce from Macquarie University, Australia
and a Master of Business Information Systems from the Otto von Guericke University. He has worked as
a project associate and software engineer for IBM and Capgemini. His interdisciplinary work in the fields
of Knowledge Representation, Databases and Information Retrieval has been published in numerous
proceedings and journals (ICDE, WWW, ISWC). He is currently involved in a large European Semantic
Web project called X-Media.
Haofen Wang
Shanghai Jiao Tong University, Apex Data & Knowledge Management Lab
800 Dongchuan Road, Shanghai, China
Office Phone: +86.21.5474-5879; eMail: whfcarter@apex.sjtu.edu.cn
http://apex.sjtu.edu.cn/apex_wiki/whfcarter
Haofen is a research associate and PhD student at the Apex Data & Knowledge Management Lab,
Shanghai Jiao Tong University. He received his master's degree in Computer Science and Engineering from
Shanghai Jiao Tong University. His research interests include semantic data creation and integration,
Semantic Web data indexing and search, and query interfaces and user interaction for the Semantic Web.
He has published several papers and has served as program committee member and
reviewer for various conferences and journals on these topics. Haofen has also led
several joint research projects with IBM China Research Laboratory and Intel Research China.