Topological data analysis encompasses a broad set of techniques that investigate the shape of data. One of the predominant tools in topological data analysis is persistent homology, which is used to create topological summaries of data called persistence diagrams. Persistent homology offers a novel method for signal analysis. Herein, we aid interpretation of the sublevel set persistence diagrams of signals by 1) showing the effect of frequency and instantaneous amplitude on the persistence diagrams for a family of deterministic signals, and 2) providing a general equation for the probability density of persistence diagrams of random signals via a pushforward measure. We also provide a topologically-motivated, efficiently computable statistical descriptor analogous to the power spectral density for signals based on a generalized Bayesian framework for persistence diagrams. This Bayesian descriptor is shown to be competitive with power spectral densities and continuous wavelet transforms at distinguishing signals with different dynamics in a classification problem with autoregressive signals.
Figure 1. Shown above (a) are the sublevel sets $ C_{-0.5} $, $ C_{0} $, $ C_{0.25} $, and $ C_1 $ for a damped cosine $ e^{-2t}\cos(8\pi t) $. (b) shows the persistence diagram of the sublevel set filtration. The points in (b) are colored to match the connected components their birth coordinates correspond to. The transition from $ C_0 $ to $ C_{0.25} $ depicts the Elder rule; notice that in $ C_0 $, there are light blue and purple connected components, which merge together in $ C_{0.25} $. A similar merging happens in the transition from $ C_{0.25} $ to $ C_{0.5} $. Since the purple component has a later birth value, it disappears into the light blue component, which persists until it merges into the green component by the same line of reasoning.
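The Elder rule described in Figure 1 can be sketched for a sampled signal with a union-find pass over the samples in order of increasing value. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sublevel_persistence`, the 1001-point sampling grid, and the convention of pairing the oldest component with the global maximum are all choices made here for the example.

```python
import math


def sublevel_persistence(values):
    """0-dimensional sublevel set persistence of a sampled 1D signal.

    Returns sorted (birth, death) pairs. Assumed convention: the component
    born at the global minimum is paired with the global maximum.
    """
    n = len(values)
    parent = [-1] * n   # -1 marks samples not yet in the sublevel set
    birth = [0.0] * n   # birth value, meaningful at each component's root

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Raising the threshold adds samples in order of increasing value.
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component with the later birth dies.
                young, old = (a, b) if birth[a] > birth[b] else (b, a)
                if values[i] > birth[young]:  # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    pairs.append((min(values), max(values)))
    return sorted(pairs)


# Damped cosine from Figure 1, sampled on [0, 1] (grid size is an assumption).
signal = [math.exp(-2 * k / 1000) * math.cos(8 * math.pi * k / 1000)
          for k in range(1001)]
diagram = sublevel_persistence(signal)
```

On a finite grid the right endpoint can register as a shallow boundary minimum, so a very low-persistence pair may appear alongside the prominent ones.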
Figure 2. This figure illustrates sources of uncertainty in persistence diagrams. Shown above are signals with additive noise (a) $ \mathcal{N}(0,0.01) $, and (b) $ \mathcal{N}(0,0.1) $ along with their persistence diagrams. The persistence diagram for the true underlying signal is shown in red. Noise introduces spurious features and also shifts the true features.
Figure 3. Top: We consider three signals. The blue signal (Signal 1) and the red signal (Signal 2) are modeled by $ a_{\beta}(t)\cos(8\pi t) $ where $ a_{\beta}(t) = 5e^{-{\beta}t} $ with $ {\beta} = 1,4 $ in Signals 1 and 2 respectively. The green signal (Signal 3) is then added to each case and the amplitudes are translated to have global minima equal to zero. Bottom: The associated persistence diagrams are plotted using the method described in Section 2.2. We observe that as $ {\beta} $ increases, the high-frequency oscillations are less affected by the low-frequency signal and converge faster towards the uniform shape of the green signal. This leads to a decrease in the variance of the persistence coordinates in the red diagram.
Figure 4. (a) The damped cosine $ e^{-2t}\cos(8\pi t) $ with additive noise $ \mathcal{N}(0,0.01) $ and (b) its persistence diagram. (c) shows an uninformative prior intensity with a single component at $ (1,1) $ with covariance matrix $ 10I $. Using the model from Equation (7) with the prior in (c) and the observed diagram in (b) results in the posterior intensity shown in (d). To account for spurious points, which we suspected to have low persistence in this example, we placed components of $ \lambda_{S} $ at $ (0.5,0.1) $, $ (1,0.1) $, $ (0.75,0.1) $, and $ (1.75,0.1) $.
Figure 5. This figure demonstrates the effect of greater low-frequency power on the persistence diagram of a signal. Panels (a) and (c) show two signals, each the sum of a low-frequency and a high-frequency oscillator. The power of the low-frequency signal is greater in (a) than in (c). To ensure that the persistence diagrams in (b) and (d) lie in $ \mathbb{W} $, the aggregate signals in both (a) and (c) have been translated so that their absolute minima are at zero. Notice in (b) that elements of the persistence diagram show greater spread along the Birth axis than in (d). This results in greater birth variance of the corresponding posterior intensity. Also notice the isolated high-persistence mode in (b), which is not present in (d). These phenomena arise because the low-frequency signal scatters the higher-frequency peaks along the Amplitude axis.
Figure 6. This plot depicts the relationship between the cardinality of persistence diagrams and the frequency of the dominant oscillation for one-second autoregressive signals across various damping factors. For each included frequency and damping factor, we simulated thirty signals (each had a component fixed at zero to give the $ 1/f $ PSD commonly seen in EEG), computed their persistence diagrams, then recorded their average cardinality. We see a strong positive correlation between this average cardinality and the frequency of the dominant oscillation (i.e., PSD Peak Frequency), consistent with the idealized deterministic sinusoid case.
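The idealized deterministic sinusoid case mentioned in Figure 6 can be checked with a short sketch. For a generic signal on an interval, each local minimum starts one connected component in the sublevel set filtration, so counting local minima gives the diagram's cardinality when the component born at the global minimum is included (conventions that drop it give one fewer). The helper names, the 1000 Hz sampling rate, and the choice of a pure cosine are assumptions made for this illustration, not details taken from the paper.

```python
import math


def count_local_minima(values):
    """Count local minima of a sampled signal, treating a flat run as one."""
    count, i, n = 0, 0, len(values)
    while i < n:
        j = i
        # Extend over a plateau of equal consecutive samples.
        while j + 1 < n and values[j + 1] == values[i]:
            j += 1
        left_higher = i == 0 or values[i - 1] > values[i]
        right_higher = j == n - 1 or values[j + 1] > values[j]
        if left_higher and right_higher:
            count += 1
        i = j + 1
    return count


def sinusoid(freq_hz, duration_s=1.0, rate_hz=1000):
    """Samples of cos(2*pi*f*t) on [0, duration_s], endpoints included."""
    n = int(round(duration_s * rate_hz))
    return [math.cos(2 * math.pi * freq_hz * k / rate_hz) for k in range(n + 1)]


# Diagram cardinality (number of local minima) for several frequencies.
cardinality = {f: count_local_minima(sinusoid(f)) for f in (2, 5, 10, 20)}
```

Under these assumptions the count grows linearly with the frequency of the one-second cosine, which is the deterministic trend the autoregressive simulations are compared against.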
Table 2. Parameter values for the autoregressive model, determined by fitting to real EEG. Missing values indicate that the optimal AR model order did not include a corresponding frequency component.
Signal Length | 1 Second | 5 Seconds | ||||||||||||||
Signal 1 | 0 | 5.87 | 18.59 | - | 344.80 | 5.37 | 16.6 | - | 0 | 6.00 | 14.4 | 20.85 | 24.98 | 10.54 | 31.64 | 26.97 |
Signal 2 | 0 | 10.70 | - | - | 202.78 | 7.41 | - | - | 0 | 10.16 | 23.02 | - | 17.24 | 4.06 | 20.37 | - |
Table 1. Precisions and recalls for each feature and classifier. Results are reported as mean
Bayesian | PSD | CWT | ||||
Classifier | Precision | Recall | Precision | Recall | Precision | Recall |
LR | ||||||
SVM - Lin. | ||||||
MLP |
The peak frequency
The average (log) power spectral densities, along with examples of signals and persistence diagrams from each class, for damping factors of (top) 4 and (bottom) 32.