notice
Doctoral Thesis Defense: Mahsa Orang
Speaker: Mahsa Orang
Supervisor: Dr. N. Shiri
Supervisory Committee: Drs. J. Bentahar, T. Bui, T. Eavis, D. Rafiei,
F. Haghighat (Chair)
Title: Similarity Search and Analysis Techniques for Uncertain Time Series Data
Date: Monday, July 27, 2015
Time: 10:00am
Place: EV 3.309
ABSTRACT
Emerging applications, such as wireless sensor networks and location-based services, require the ability to analyze large quantities of uncertain time series, in which the exact value at each timestamp is unavailable or unknown. Traditional similarity search techniques developed for standard time series are not always effective for uncertain time series, which motivates the work in this dissertation. We investigate new, efficient techniques for similarity search and analysis in general, and for correlation queries in particular, over both models of uncertain time series: PDF-based, in which each value is described by a probability density function, and multiset-based, in which each value is described by a multiset of observed values. We first formalize the notion of normalization for uncertain time series and use it to define correlation for such data. We then consider a class of probabilistic, threshold-based correlation queries. We model uncertain correlation as a random variable, which serves as the basis for our similarity search and analysis techniques. We also propose several techniques for query optimization and for improving query quality. Finally, we demonstrate experimentally how the proposed techniques improve similarity search over uncertain time series. We believe that our results provide a theoretical baseline for the uncertain time series management and analysis tools that will be required to support many existing and emerging applications.
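To make the idea of a probabilistic, threshold-based correlation query concrete, the following is a minimal Python sketch for the multiset-based model. It is not the method developed in the thesis: it assumes a simple Monte Carlo estimate in which one observed value is drawn uniformly at random per timestamp, and every function name and parameter (`prob_correlation_at_least`, `threshold_query`, `c`, `tau`, `trials`) is hypothetical and chosen only for illustration.

```python
import random
from statistics import mean, pstdev


def z_normalize(series):
    """Standard z-normalization of one certain (sampled) series."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information.
        return [0.0] * len(series)
    return [(v - mu) / sigma for v in series]


def pearson(x, y):
    """Pearson correlation as the mean product of z-normalized values."""
    zx, zy = z_normalize(x), z_normalize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def sample_world(uncertain_series, rng):
    """Pick one observed value per timestamp (assumed uniform choice)."""
    return [rng.choice(observations) for observations in uncertain_series]


def prob_correlation_at_least(x, y, c, trials=2000, seed=0):
    """Estimate Pr[corr(X, Y) >= c] by sampling possible worlds.

    x and y are lists of equal length; each element is the multiset
    (here a list) of values observed at that timestamp.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if pearson(sample_world(x, rng), sample_world(y, rng)) >= c:
            hits += 1
    return hits / trials


def threshold_query(x, y, c, tau, **kwargs):
    """Return True if the estimated probability reaches the threshold tau."""
    return prob_correlation_at_least(x, y, c, **kwargs) >= tau
```

In this sketch the correlation of two uncertain series is treated as a random variable whose distribution is approximated by repeated sampling. A query such as `threshold_query(x, y, c=0.8, tau=0.9)` asks whether the two series are correlated at level 0.8 or higher with estimated probability at least 0.9.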