Functional Data Structures for Time Series Summary
IOT is ubiquitous now a days. With time series data pouring in from all possible sources, there is an increasing need to process the data at near real time. However, the transmission, storage and compute cost associated with this data is ever increasing.
Functional Data Structure for Time Series
In this talk, we will look at functional implementation of discrete wavelet transforms implemented using Haskell. We will use Haar wavelets for the the demo. The high and low pass filter definition of a wavelet allows a very simple functional implementation of Haar wavelet transform.
We will look at simple recursive definition for forward and reverse wavelet transforms. The wavelet transform is very simple to implement in O(N) time complexity using recursive definition, and higher order functions.
Wavelet as a Container
Though, these transform functions are good for bulk transform i.e. to convert incoming data to/from wavelet space at once, they are not good for point queries. We will look at how we can partition a wavelet vector in dyadic ranges, and dynamically construct orthonormal basis of the wavelet by directly querying only log N coefficients.
The wavelet container that we constructed can be sparse. i.e. it is possible to prune the coefficients that are not contributing to approximation, and still use the wavelet container.
Sketches are random algorithms to estimate the result with a given accuracy with high probability.
We can use sum-of-square energy function property to prune the coefficients. Finally we use sublinear AMS sketching algorithm to predict the energy, and prune the coefficients efficiently in log N time rather than O(N) time. This allows us to quickly prune the wavelet container by getting rid of large number of coefficients. This allows very compact representation of the time series data that allows point queries.
We show Haskell implementation
- Sketch as a class, and linear properties of Sketch
- Implementation of AMS using hash functions, and probability distributions
- Using AMS with hash bucketing to prune wavelet coefficents
We will look at a historical stock data, and use wavelet transform and sketching to reduce the data size.
Currently being studied at my organization, such functional application is found to be useful in representing data for quick analytics.
Outline/Structure of the Case Study
- Necessity of approximating time series
- Representing discrete wavelet transformation for multivariate signals in functional language (Haskell)
- Applying wavelet transforms for Haskell Lists
- Creating functional sketches
- Using sketches and wavelet coefficients.
- Running query on container.
- Stream algorithms
- Working with random algorithms
- Signal processing in functional language
- Reducing signal size
People interested in applying FP in day-to-day life to solve interesting problems.
Prerequisites for Attendees
No formal knowledge of Haskell is necessary, though through simple List processing we will move to parallel high/low pass filter for time series data for fast processing.
It is assumed that the audience has some idea of FP, immutability.
Some math fundamentals (Linear Algebra, Convolution) will be very useful.