
Query Aware Determinization of Uncertain Objects
Abstract
This paper considers the problem of determinizing probabilistic data to enable such data to be stored in legacy
systems that accept only deterministic input. Probabilistic data may be generated by automated
data analysis/enrichment techniques such as entity resolution, information extraction, and speech
processing. The legacy system may correspond to pre-existing web applications such as Flickr,
Picasa, etc. The goal is to generate a deterministic representation of probabilistic data that
optimizes the quality of the end-application built on deterministic data. We explore such a
determinization problem in the context of two different data processing tasks -- triggers and
selection queries. We show that approaches such as thresholding or top-1 selection traditionally
used for determinization lead to suboptimal performance for such applications. Instead, we
develop a query-aware strategy and show its advantages over existing solutions through a
comprehensive empirical evaluation over real and synthetic datasets.
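
To make the contrast concrete, the following is a minimal sketch of the two baseline strategies named above (thresholding and top-1 selection), next to a simple cost-based rule that hints at what "query-aware" can mean. It is not the paper's algorithm. The function names, the tag dictionary, and the false-positive/false-negative costs `c_fp` and `c_fn` are all assumptions made for illustration, and the cost-based rule is ordinary expected-cost minimization for a single-tag trigger.

```python
from typing import Dict, Set


def threshold_determinize(tags: Dict[str, float], tau: float = 0.5) -> Set[str]:
    """Baseline: keep every tag whose probability is at least tau."""
    return {t for t, p in tags.items() if p >= tau}


def top1_determinize(tags: Dict[str, float]) -> Set[str]:
    """Baseline: keep only the single most probable tag."""
    if not tags:
        return set()
    return {max(tags, key=tags.get)}


def cost_aware_determinize(tags: Dict[str, float], c_fp: float, c_fn: float) -> Set[str]:
    """Illustrative rule (assumed, not from the paper): keep a tag when the
    expected cost of omitting it (p * c_fn) exceeds the expected cost of
    including it ((1 - p) * c_fp)."""
    return {t for t, p in tags.items() if p * c_fn > (1 - p) * c_fp}


# Hypothetical output of an automated tagger for one photo.
photo = {"beach": 0.6, "sunset": 0.45, "dog": 0.1}

baseline_threshold = threshold_determinize(photo, tau=0.5)  # {"beach"}
baseline_top1 = top1_determinize(photo)                     # {"beach"}
# A trigger for which a missed tag is four times as costly as a false alarm.
cost_aware = cost_aware_determinize(photo, c_fp=1.0, c_fn=4.0)  # {"beach", "sunset"}
```

In this toy case both baselines drop "sunset", while the cost-based rule keeps it because a missed match is assumed to be more expensive than a spurious one. This is the kind of application-dependent trade-off a fixed threshold cannot capture.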