January  2021, 17(1): 205-220. doi: 10.3934/jimo.2019107

## A bidirectional weighted boundary distance algorithm for time series similarity computation based on optimized sliding window size

 1 School of Automation, Central South University, Changsha, Hunan 410083, China 2 School of Computer, Hunan University of Technology, Zhuzhou, Hunan 412007, China

* Corresponding author: Zhaohui Tang

Received October 2018 Revised April 2019 Published September 2019

Choosing the sliding window size of a time series by empirical value has several problems that need to be solved. In particular, when the original measurement data collected from industrial equipment are large in volume and high in density, the important information in the data cannot be maximally retained, and the computational complexity is high. Therefore, by studying the effect of the sliding window on time series similarity techniques in practical applications, an algorithm for determining the initial size of the sliding window is proposed. Upper and lower boundary curves with a higher degree of fit are constructed, and trend weighting is introduced into the $LB\_Hust$ distance calculation method to reduce the difficulty of mathematical modeling and improve the efficiency of data similarity computation.
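
The idea of comparing two series through sliding-window boundary curves can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the per-window max/min envelope, and the gap-based distance are assumptions chosen for illustration, and the trend weighting is omitted because the abstract does not give its form. Both series are assumed to have the same length.

```python
import numpy as np

def sliding_windows(series, window_size, step):
    """Return the windows of length `window_size` taken every `step` samples."""
    series = np.asarray(series, dtype=float)
    if window_size < 1 or step < 1 or len(series) < window_size:
        raise ValueError("need 1 <= window_size <= len(series) and step >= 1")
    starts = range(0, len(series) - window_size + 1, step)
    return np.array([series[s:s + window_size] for s in starts])

def boundary_curves(series, window_size, step):
    """Upper and lower boundary curves: the max and min of each window."""
    windows = sliding_windows(series, window_size, step)
    return windows.max(axis=1), windows.min(axis=1)

def boundary_gap_distance(a, b, window_size, step):
    """Illustrative distance: only the gaps between non-overlapping
    envelopes of the two series contribute (assumed form, unweighted)."""
    upper_a, lower_a = boundary_curves(a, window_size, step)
    upper_b, lower_b = boundary_curves(b, window_size, step)
    gap_ab = np.clip(lower_a - upper_b, 0.0, None)  # a lies entirely above b
    gap_ba = np.clip(lower_b - upper_a, 0.0, None)  # b lies entirely above a
    return float(np.sqrt(np.sum(gap_ab ** 2 + gap_ba ** 2)))
```

In this sketch a larger window widens the envelopes and so shrinks the distance, while a larger step reduces the number of windows to compare, which is why the two parameters need to be chosen together.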

Citation: Cheng Peng, Zhaohui Tang, Weihua Gui, Qing Chen, Jing He. A bidirectional weighted boundary distance algorithm for time series similarity computation based on optimized sliding window size. Journal of Industrial & Management Optimization, 2021, 17 (1) : 205-220. doi: 10.3934/jimo.2019107
Distance between three series
The sliding window principle
The normalized state of 6 types of time series
Step length range
Time series of the three generators
Clustering result comparison
Clustering error rate with different weight coefficients
Precision of five methods on 5 data sets
Runtime of five methods on 5 data sets
Dataset attribute

| Type | Items | Status |
|---|---|---|
| 1 | 1-100 | Normal |
| 2 | 101-200 | Cyclic |
| 3 | 201-300 | Increasing trend |
| 4 | 301-400 | Decreasing trend |
| 5 | 401-500 | Upward shift |
| 6 | 501-600 | Downward shift |

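The combination value of ${w_s}$ and ${L_s}$ and the corresponding distance ${D_T}$

|  | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 49.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 3 | 55.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 4 | 54.8 | 60.6 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 5 | 58.7 | 61.2 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 6 | 64.3 | 65.3 | 60.0 | - | - | - | - | - | - | - | - | - | - | - | - |
| 7 | 62.5 | 63.9 | 61.3 | - | - | - | - | - | - | - | - | - | - | - | - |
| 8 | 64.8 | 65.6 | 64.0 | 62.6 | - | - | - | - | - | - | - | - | - | - | - |
| 9 | 62.5 | 61.2 | 63.5 | 62.3 | - | - | - | - | - | - | - | - | - | - | - |
| 10 | 63.4 | 63.9 | 59.1 | 62.9 | 60.8 | - | - | - | - | - | - | - | - | - | - |
| 11 | 58 | 61.9 | 62.1 | 58.5 | 60.1 | - | - | - | - | - | - | - | - | - | - |
| 12 | 61.3 | 61.2 | 58.3 | 61.1 | 60.9 | 59.1 | - | - | - | - | - | - | - | - | - |
| 13 | 60.2 | 59.9 | 59.1 | 49.3 | 58.2 | 49.3 | - | - | - | - | - | - | - | - | - |
| 14 | 49.9 | 59.6 | 60.2 | 60.8 | 59.1 | 60.1 | 61.7 | - | - | - | - | - | - | - | - |
| 15 | 55.9 | 56.2 | 53.2 | 48.2 | 60.2 | 49.2 | 58.7 | - | - | - | - | - | - | - | - |
| 16 | 60.9 | 55.2 | 58.3 | 58.0 | 52.1 | 50.0 | 56.1 | 49.0 | - | - | - | - | - | - | - |
| 17 | 49.2 | 53.1 | 54.4 | 55.2 | 60.8 | 59.4 | 51.7 | 53.3 | - | - | - | - | - | - | - |
| 18 | 53.7 | 54.4 | 52.0 | 52.3 | 52.8 | 60.1 | 59.2 | 49.2 | 49.2 | - | - | - | - | - | - |
| 19 | 58.3 | 60.8 | 55.1 | 50.3 | 58.7 | 58.0 | 58.1 | 53.0 | 51.9 | - | - | - | - | - | - |
| 20 | 50.7 | 53.3 | 55.8 | 50.1 | 42.3 | 45.7 | 50.0 | 51.2 | 51.2 | 46.3 | - | - | - | - | - |
| 21 | 49.6 | 50.2 | 46.9 | 46.5 | 47.2 | 47.1 | 47.5 | 46.0 | 48.2 | 43.9 | - | - | - | - | - |
| 22 | 51.3 | 49.9 | 48.2 | 46.7 | 50.0 | 45.0 | 40.0 | 39.2 | 39.5 | 40.7 | 39.1 | - | - | - | - |
| 23 | 39.9 | 54.2 | 50.0 | 49.8 | 49.0 | 36.8 | 36.5 | 38.1 | 42.5 | 43.9 | 35.9 | - | - | - | - |
| 24 | 46.8 | 51.9 | 48.3 | 45.0 | 38.2 | 42.7 | 39.4 | 50.2 | 41.9 | 38.4 | 43.3 | 51.8 | - | - | - |
| 25 | 51.6 | 40.0 | 58.7 | 43.1 | 40.0 | 39.4 | 35.0 | 45.3 | 45.9 | 41.2 | 38.1 | 40.9 | - | - | - |
| 26 | 47.3 | 48.6 | 50.3 | 39.6 | 42.6 | 55.2 | 42.0 | 36.1 | 35.0 | 42.0 | 43.8 | 39.5 | 47.8 | - | - |
| 27 | 47.5 | 51.3 | 40.0 | 41.6 | 39.5 | 35.0 | 50.0 | 49.2 | 39.4 | 38.4 | 35.6 | 39.2 | 49.9 | - | - |
| 28 | 45 | 40.4 | 38.4 | 35.0 | 35.7 | 46.2 | 50.6 | 45.2 | 39.1 | 39.6 | 42.1 | 48.2 | 40.0 | 38.9 | - |
| 29 | 50.1 | 48.3 | 40.2 | 41.6 | 35.9 | 36.1 | 40.3 | 39.4 | 50.1 | 46.3 | 39.6 | 35.9 | 35.9 | 35.0 | - |
| 30 | 42.2 | 49.8 | 45.0 | 39.2 | 40.0 | 38.9 | 40.4 | 39.3 | 37.5 | 38.6 | 36.3 | 36.9 | 35.0 | 36.2 | 35.2 |
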
5 Groups Dataset

| Dataset | Samples | Categories | Attributes |
|---|---|---|---|
| temperature | 148 | 3 | 2 |
| pressure | 169 | 4 | 12 |
| position | 327 | 10 | 17 |
| concentration | 112 | 6 | 16 |
| flow rate | 236 | 5 | 7 |

Cross-validation results

| Dataset | Optimal ${w_s}$ | Optimal ${w_n}$ | Optimal ${w_p}$ | Average precision (test set) | Average precision (training set) |
|---|---|---|---|---|---|
| temperature | 8 | 0.6 | 0.4 | 90% | 92% |
| pressure | 9 | 0.6 | 0.4 | 89% | 91% |
| position | 8 | 0.6 | 0.4 | 91% | 92% |
| concentration | 8 | 0.6 | 0.4 | 90% | 91% |
| flow rate | 8 | 0.6 | 0.4 | 90% | 92% |
