Probabilistic Answers to Queries
Dan Suciu
University of Washington
This talk promotes a new query paradigm, in which incomplete and conflicting evidences are collected from a variety of data sources and used to answer complex queries. The system assigns a probability to each tuple in the answer and uses it to rank the results. The first part of the talk contains several examples showing how inexact matches between data values, inexact matches between schema components, lack of sufficient evidence to answer a given query, and violations of constraints across data sources, can all be modeled into a single, unified probabilistic framework. It also illustrates how the same probabilistic framework can be used to reason about information disclosure. The second half of the talk describes a new theoretical result that is a key piece in this new probabilistic framework: how to compute the asymptotic conditional probabilities of queries. The connection to random graphs will also be explained.
- Categories: