Front Matter - VLDB Endowment

39th International Conference on Very Large Data Bases, Riva del Garda, Trento, Italy
Proceedings of the VLDB Endowment
Volume 6, No. 8 – June 2013
Proceedings of the 39th International Conference on Very Large Data Bases, Riva del Garda, Trento, Italy
Editors-in-Chief: Michael Böhlen, Christoph Koch
Associate Editors – Research Track: Ashraf Aboulnaga, Sihem Amer-Yahia, Chee Yong Chan, Yanlei Diao, Ada Waichee Fu, Johannes Gehrke, Alon Halevy, Jayant Haritsa, Nikos Mamoulis, Thomas Neumann, Dan Olteanu, Divesh Srivastava, Jens Teubner
Associate Editor – Experiments and Analysis Track: Stefan Manegold
Guest Editors: Sihem Amer-Yahia, Stefan Manegold
Proceedings Editors: Peer Kröger, Stratis D. Viglas
PVLDB – Proceedings of the VLDB Endowment
Volume 6, No. 8, June 2013.
The 39th International Conference on Very Large Data Bases, Riva del Garda, Trento, Italy.
Copyright 2013 VLDB Endowment
Permission to make digital or hard copies of portions of this work for personal or classroom use is granted
without fee provided that copies are not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. Copyright for components of this work
owned by others than VLDB Endowment must be honored. Abstracting with credit is permitted. To copy
otherwise, to republish, to post on servers or to redistribute to lists requires prior specific permission
and/or a fee. Request permission to republish from PVLDB under email:
Volume 6, Number 8, June 2013: VLDB 2013
Pages ii - x and 541 - 600
ISSN 2150-8097
Additional copies only online at:,, and
PVLDB Vol. 6 No. 8
VLDB2013 – Riva del Garda, Trento, Italy
Front Matter
Copyright Notice
Table of Contents
VLDB 2013 Organization and Review Board
Letter from the Guest Editors
    Sihem Amer-Yahia, Stefan Manegold
Research Papers
Hybrid Storage Management for Database Systems
    Xin Liu, Kenneth Salem
Scorpion: Explaining Away Outliers in Aggregate Queries
    Eugene Wu, Samuel Madden
Ratio Threshold Queries over Distributed Data Sources
    Rajeev Gupta, Krithi Ramamritham, Mukesh Mohania
On the Complexity of Query Result Diversification
    Ting Deng, Wenfei Fan
Streaming Quotient Filter: A Near Optimal Approximate Duplicate Detection Approach for Data Streams
    Sourav Dutta, Ankur Narang, Suman K. Bera
General Chairs
Themis Palpanas, University of Trento
Yannis Velegrakis, University of Trento
Program Chairs
Michael Böhlen, University of Zurich
Christoph Koch, EPFL
Advisory Board
Paolo Atzeni, Università Roma Tre
Stefano Ceri, Politecnico di Milano
John Mylopoulos, University of Trento
Award Committee
Surajit Chaudhuri, Microsoft (Chair)
Mike Carey, University of California, Irvine
Susan Davidson, University of Pennsylvania
Alon Halevy, Google
Sunita Sarawagi, IIT Bombay
Associate Editors
Ada Wai-Chee Fu, Chinese University of Hong Kong
Alon Halevy, Google
Ashraf Aboulnaga, University of Waterloo
Chee-Yong Chan, National University of Singapore
Dan Olteanu, Oxford University
Divesh Srivastava, AT&T Labs
Jayant Haritsa, Indian Institute of Science Bangalore
Jens Teubner, ETH Zurich
Johannes Gehrke, Cornell University
Nikos Mamoulis, University of Hong Kong
Sihem Amer-Yahia, Qatar Computing Research Institute
Stefan Manegold, CWI
Thomas Neumann, Technische Universität München
Yanlei Diao, University of Massachusetts Amherst
Experiments and Analysis Track Associate Editor
Stefan Manegold, CWI
Industrial and Applications Track Associate Editors
Min Wang, HP Labs China
Cong Yu, Google Research
Demonstration Chairs
Jun Yang, Duke University
Dimitrios Gunopulos, University of Athens
Letizia Tanca, Politecnico di Milano
Reproducibility Chairs
Philippe Bonnet, IT University of Copenhagen
Juliana Freire, New York University
Dennis Shasha, New York University
Research Track Review Board
Karl Aberer, EPFL, Switzerland
Brian Cooper, Google
Foto Afrati, NTU Athens
Bin Cui, Peking University
Charu Aggarwal, IBM T. J. Watson Research Center
Carlo Curino, MIT
Yanif Ahmad, JHU
Sudipto Das, Microsoft Research
Jose-Luis Ambite, University of Southern California
Anish Das Sarma, Google Research
Walid Aref, Purdue University
Atish Das Sarma, eBay Research Labs
Magdalena Balazinska, University of Washington
Antonios Deligiannakis, Technical University of Crete
Srikanta Bedathur, IIIT Delhi
Amol Deshpande, University of Maryland
Peter Boncz, CWI
Xin Luna Dong, AT&T Labs-Research
Nico Bruno, Microsoft
Sameh Elnikety, Microsoft Research
Randal Burns, JHU
Mohamed Eltabakh, Worcester Polytechnic Institute
Andrea Cali, University of London, Birkbeck College
Alan Fekete, University of Sydney
Carlos Castillo, Yahoo!
Hakan Ferhatosmanoglu, Bilkent University
Gang Chen, Zhejiang University
Alvaro Fernandes, U. of Manchester
Lei Chen, Hong Kong University of Science and Technology
Juliana Freire, New York University
Benjamin C. M. Fung, Concordia University
Shimin Chen, HP Labs China
Fabien Gandon, INRIA
James Cheng, CUHK
Reynold Cheng, University of Hong Kong
Minos Garofalakis, Technical University of Crete
Gao Cong, Nanyang Technological University
Buğra Gedik, Bilkent University
Rainer Gemulla, Max-Planck-Institut Saarbrücken
Paul Larson, Microsoft
Gabriel Ghinita, University of Massachusetts Boston
Mong-Li Lee, National University of Singapore
Parke Godfrey, York University
Wang-Chien Lee, Penn State University
Michaela Goetz, Cornell University
Wolfgang Lehner, Technische Universität Dresden
Lukasz Golab, University of Waterloo
Chengkai Li, The University of Texas at Arlington
Sergio Greco, University of Calabria
Cuiping Li, Renmin University of China
Le Gruenwald, University of Oklahoma
Feifei Li, University of Utah
Krishna Gummadi, MPI
Guoliang Li, Tsinghua University
Haryadi Gunawi, University of California, Berkeley
Lipyeow Lim, University of Hawaii at Manoa
Rahul Gupta, IIT Bombay
Xuemin Lin, University of New South Wales
Marios Hadjieleftheriou, AT&T Labs
Eric Lo, The Hong Kong Polytechnic University
Harumi Kuno, HP Labs
Boon Thau Loo, University of Pennsylvania
Michael Hay, Cornell
Qiong Luo, Hong Kong University of Science and Technology
Bingsheng He, NTU Singapore
Ashwin Machanavajjhala, Duke University
Sven Helmer, Free University of Bozen-Bolzano
Sanjay Madria, University of Missouri-Rolla
Howard Ho, IBM Almaden Research
Amélie Marian, Rutgers University
Katja Hose, Aalborg University
Frank McSherry, Microsoft
Bill Howe, University of Washington
Sharad Mehrotra, University of California, Irvine
Jeong-Hyon Hwang, State University of New York
Meikel Poess, Oracle
Stratos Idreos, CWI
Mohamed Mokbel, University of Minnesota
Hans-Arno Jacobsen, University of Toronto
Bongki Moon, University of Arizona
Ricardo Jimenez-Peris, Technical University of Madrid
Kyriakos Mouratidis, Singapore Management University
Ruoming Jin, Kent State University
Gero Muhl, University of Rostock
Ryan Johnson, University of Toronto
Karin Murthy, IBM Research
Vanja Josifovski, Yahoo Inc.
Suman Nath, MSR
Panos Kalnis, King Abdullah University of Science and Technology
Wolfgang Nejdl, University of Hannover
Vana Kalogeraki, Athens Univ. of Econ. and Business
Sylvia Nittel, University of Maine
Carl-Christian Kanne, University of Mannheim
Beng Chin Ooi, National University of Singapore
Hillol Kargupta, University of Maryland Baltimore
Tamer Ozsu, University of Waterloo
Esther Pacitti, University of Montpellier
Yiping Ke, Institute of High Performance Computing
Ippokratis Pandis, IBM Almaden
Anne-Marie Kermarrec, INRIA
Olga Papaemmanouil, Brandeis University
Daniel Kifer, PSU
Srinivasan Parthasarathy, The Ohio State University
Changkyu Kim, Intel
Jignesh Patel, University of Wisconsin
George Kollios, Boston University
Peter Pietzuch, Imperial College London
Christian König, Microsoft Research
Neoklis Polyzotis, University of California, Santa Cruz
Laks V. S. Lakshmanan, University of British Columbia
Lucian Popa, IBM Research
Rajesh Bordawekar, IBM T.J. Watson
Evimaria Terzi, Boston University
Vibhor Rastogi, Yahoo
Martin Theobald, Max Planck Institute, Germany
Christopher Re, University of Wisconsin, Madison
Anthony Tung, National University of Singapore
Matthias Renz, Ludwig-Maximilians University Munich
Kostas Tzoumas, Technical University of Berlin
Marie-Christine Rousset, IMAG
Sergei Vassilvitskii, Google
Stratis D. Viglas, University of Edinburgh
Sourav S. Bhowmick, Nanyang Technological University
Dimitris Sacharidis, IMIS Athena, Greece
Ke Wang, Simon Fraser University
Ingmar Weber, Yahoo!
Kenneth Salem, University of Waterloo
Maria Sapino, University of Torino
Raymond Chi-Wing Wong, Hong Kong University of Science and Technology
Monica Scannapieco, Istat
Xiaokui Xiao, NTU
Bernhard Seeger, Philipps-Universität Marburg
Dong Xin, Google
Pierre Senellart, Télécom ParisTech
Xifeng Yan, University of California, Santa Barbara
Cyrus Shahabi, USC
Jiong Yang, Case Western Reserve University
Lidan Shou, Zhejiang University
Ke Yi, Hong Kong University of Science and Technology
Adam Silberstein, Trifacta
Man Lung Yiu, Hong Kong Polytechnic University
Radu Sion, Stony Brook University
Cong Yu, Google Research
Yannis Sismanis, IBM, USA
Ge Yu, Northeastern University, China
Mohamed Soliman, University of Waterloo
Jeffrey Yu, Chinese University of Hong Kong
Julia Stoyanovich, Drexel University and Skoltech
Wenjie Zhang, UNSW Australia
Yufei Tao, Chinese University of Hong Kong
Baihua Zheng, Singapore Management University
Sandeep Tata, IBM Research
Aoying Zhou, East China Normal University
Nesime Tatbul, ETH Zurich
Xiaofang Zhou, University of Queensland
Demonstration Program Committee
Anastasia Ailamaki, EPFL
Nick Koudas, University of Toronto
Sihem Amer-Yahia, Qatar Computing Research Institute
Nikos Mamoulis, University of Hong Kong
Giansalvatore Mecca, Università della Basilicata
Leopoldo Bertossi, Carleton University
Alexandra Meliou, University of Washington
Francois Bry, University of Munich
Rachel Pottinger, University of British Columbia
Chee-Yong Chan, National University of Singapore
Rajeev Rastogi, Yahoo! India
Kevin Chang, UIUC
Bernhard Seeger, University of Marburg
Chin-Wan Chung, Korea Advanced Institute of Science and Technology
Ambuj Singh, University of California, Santa Barbara
Gautam Das, University of Texas, Arlington
Jens Teubner, ETH Zurich
Aris Gkoulalas-Divanis, IBM Research Ireland
Wei Wang, University of New South Wales
Torsten Grust, Universität Tübingen
Li Xiong, Emory University
Herodotos Herodotou, Microsoft Research
Jia Yuan Yu, IBM Research
Yoshiharu Ishikawa, Nagoya University
Demetris Zeinalipour, University of Cyprus
Flip Korn, AT&T Labs
Shuigeng Zhou, Fudan University
Industrial Track Committee
Michael Brodie, Verizon
Felix Naumann, University of Potsdam
Alejandro Buchmann, Technische Universität
Fatma Ozcan, IBM Research
Shimin Chen, HP Labs China
Radu Popescu-Zeletin, Fraunhofer-Institut für Offene
Umeshwar Dayal, HP Labs
Raghu Ramakrishnan, Microsoft
Shel Finkelstein, SAP
Jun Rao, LinkedIn
Dieter Gawlick, Oracle
Len Seligman, MITRE
Tasos Kementsietsidis, IBM T.J. Watson Research Center
Eric Simon, SAP
Tim Kraska, Brown University
Haixun Wang, Microsoft Research
Yue Lu, Twitter
Fei Wu, Google Research
Arnab Nandi, The Ohio State University
Jackie Xiang, Foursquare
Reproducibility Committee
Matias Bjørling, IT University of Copenhagen
Mian Lu, Hong Kong University of Science and Technology
Wei Cao, Renmin University
Dan Olteanu, University of Oxford
Stratos Idreos, Centrum Wiskunde & Informatica
Paolo Papotti, Qatar Computing Research Institute
Ryan Johnson, University of Toronto
Martin Kaufmann, ETH Zurich
Ben Sowell, Cornell University
David Koop, University of Utah
Lucja Kot, Cornell University
Radu Stoica, EPFL - Ecole Polytechnique Federale de Lausanne
Willis Lang, University of Wisconsin
Dimitris Tsirogiannis, Microsoft Jim Gray Systems Lab
PhD Workshop Chairs
Angela Bonifati, Icar-CNR
Sanjay Chawla, University of Sydney
Chris Jermaine, Rice University
Tutorial Chairs
Serge Abiteboul, INRIA
Gianni Mecca, Università della Basilicata
Haixun Wang, Microsoft Research Asia
Panel Chairs
Shivnath Babu, Duke University
Stavros Harizopoulos, Nou Data
Ihab Ilyas, Qatar Computing Research Institute
Sponsorship Chairs
Sam Madden, Massachusetts Institute of Technology
Vassilis Vassalos, Athens Univ. of Econ. and Business
Paolo Merialdo, Università Roma Tre
Publicity Chair
Tasos Kementsietsidis, IBM T.J. Watson Research Center
Proceedings Chairs
Peer Kröger, Ludwig-Maximilians University, Munich
Stratis D. Viglas, University of Edinburgh
Web Management Chair
Francesco Guerra, University of Modena and Reggio Emilia
Treasury Chair
Marios Hadjieleftheriou, AT&T Labs Research
Local Administration
Ufficio Convegni and dbTrento Group, University of Trento
Logo Design
Sakis Palpanas
PVLDB Information Director
Gerald Weber, University of Auckland
PVLDB Advisory Committee
Philip Bernstein, Michael Böhlen, Peter Buneman, Susan Davidson, Z. Meral Ozsoyoglu, S. Sudarshan, Gerhard
It is our pleasure to present the eighth issue of Volume 6 of the Proceedings of the VLDB Endowment (PVLDB).
This issue contains five papers, all from the Research Track. Topics include data analytics, novel storage,
distributed query optimization, the complexity of result diversification, and data streams. These and other
papers will be presented at the VLDB 2013 Conference, to be held in Riva del Garda, Trento, Italy, August 26-30, 2013.
All of these papers were reviewed in a journal-style review and revision process with monthly year-round
submission cycles.
As Associate Editors, we have enjoyed overseeing the review process of the excellent papers accepted to
VLDB 2013. We thank the authors for their submissions and for their willingness to incorporate the
feedback they received in a timely manner. We also thank the Review Board for ensuring the scientific and
technical quality of the papers with thorough reviews and on-time discussions. It has been a pleasure to work
with an outstanding set of editorial colleagues. We hope that this new process will continue in the coming years.
These proceedings would not have been possible without the continuous involvement of our colleagues in
the research community. We are proud to be part of such a team and we look forward to seeing you in
Riva del Garda.
Sihem Amer-Yahia, CNRS
Stefan Manegold, CWI
Associate Editors, PVLDB 2013