notice
Doctoral Seminar: Aminata Kane
Speaker: Aminata Kane
Supervisor: Dr. N. Shiri
Supervisory Committee:
Drs. J. Bentahar, T. Eavis, B. Jaumard
Title: Efficient and Scalable Techniques for Multivariate Time Series Analysis and Search
Date: Thursday, April 7, 2016
Time: 10:15am
Place: EV 1.162
ABSTRACT
Innovation and advances in technology have led to the growth of data at a phenomenal rate. However, existing techniques proposed for multivariate time series (MTS) data analysis and similarity search are generally inadequate for high-dimensional, voluminous data. The goal of this research is to address these problems by developing more efficient and scalable solution techniques. As is customary, the success of such solutions relies on effective dimensionality reduction as a preprocessing step. Feature selection has often been used as a dimensionality reduction technique; it helps identify a subset of dimensions that captures most characteristics of the data. In our work, we present a more effective feature selection technique based on statistics drawn from the Singular Value Decomposition (SVD) of the input MTS data matrix. It reduces the dimensionality of the data while retaining and ranking its most influential features.
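To make the idea of SVD-based feature ranking concrete, the following is a minimal sketch only. The abstract does not specify which statistics the speaker draws from the SVD, so the scoring rule here (squared loadings weighted by each component's variance share) is an assumption, as are the function name `rank_features_by_svd`, the parameter `k`, and the synthetic data.

```python
import numpy as np

def rank_features_by_svd(X, k=2):
    """Rank the columns (variables) of X, shape (timesteps, variables).

    Assumed scoring rule, not the speaker's: each variable's squared
    loadings on the top-k right singular vectors, weighted by the share
    of variance each component explains.
    """
    Xc = X - X.mean(axis=0)                       # center each variable
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    weights = s[:k] ** 2 / np.sum(s ** 2)         # variance share per component
    scores = weights @ (Vt[:k] ** 2)              # one score per variable
    order = np.argsort(scores)[::-1]              # most influential first
    return order, scores

# Synthetic MTS with three variables: strong signal, near-constant noise, weaker signal.
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 200)
X = np.column_stack([
    5.0 * np.sin(t),
    0.01 * rng.standard_normal(200),
    2.0 * np.cos(t),
])
order, scores = rank_features_by_svd(X, k=2)
print(order, scores)
```

Keeping only the first few entries of `order` would give a reduced set of original variables, which is what distinguishes feature selection from projecting onto the singular vectors themselves.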
In related research, we also study the similarity search problem for MTS in two cases: linear and nonlinear datasets. Our approach is to first develop a novel correlation analysis method for streaming linear MTS and then extend it to the nonlinear case for data at rest. The solution combines so-called adaptive randomized dimensionality reduction, a pruning idea, and a threshold-based correlation computation. The proposed pruning method is based on a novel Boolean representation of the time series that better captures the characteristics of the data while providing increased accuracy, as shown by the preliminary results of our empirical evaluation on real benchmark data in application domains such as functional magnetic resonance imaging (fMRI).
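The combination of pruning and threshold-based correlation can be illustrated with a rough sketch. The speaker's Boolean representation is not described in this notice, so the one used here (whether each value exceeds its series median) is purely a stand-in. The names `boolean_sketch` and `correlated_pairs` and the parameters `threshold` and `min_agreement` are likewise hypothetical. The randomized dimensionality reduction step is omitted, and the pruning rule is a heuristic that may discard some genuinely correlated pairs.

```python
import numpy as np

def boolean_sketch(series):
    """Stand-in Boolean representation: True where a value exceeds the series median."""
    return series > np.median(series)

def correlated_pairs(X, threshold=0.8, min_agreement=0.7):
    """Return (i, j, r) for column pairs of X with |Pearson r| >= threshold.

    Pairs are first filtered with a cheap bitwise comparison; the exact
    correlation is computed only for pairs that survive the filter.
    """
    n = X.shape[1]
    bits = [boolean_sketch(X[:, j]) for j in range(n)]
    results = []
    for i in range(n):
        for j in range(i + 1, n):
            agree = np.mean(bits[i] == bits[j])
            # Heuristic prune: skip pairs whose sketches neither mostly
            # agree (positive correlation) nor mostly disagree (negative).
            if max(agree, 1.0 - agree) < min_agreement:
                continue
            r = np.corrcoef(X[:, i], X[:, j])[0, 1]
            if abs(r) >= threshold:
                results.append((i, j, float(r)))
    return results

# Synthetic example: three linearly related series and one unrelated series.
t = np.linspace(0, 4 * np.pi, 400)
X = np.column_stack([np.sin(t), 2 * np.sin(t) + 1, -np.sin(t), np.cos(t)])
pairs = correlated_pairs(X)
print(pairs)
```

The point of the design is that comparing Boolean vectors is much cheaper than computing a correlation coefficient, so the expensive step runs only on candidate pairs.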