Mathematical Software - Trend Analysis - Tutorial

Trend[1]

Let us consider a time series x(t). We assume that it can be decomposed into a deterministic part η (the trend) and a random part ε:

$$x_i = \eta_i + \varepsilon_i,$$

where i indexes the moments in time at which x(t) was measured. Extracting the deterministic part of a time series is important for analyzing the deterministic sources of the signal. For each value $x_i$ one can compute the moving average over $2k + 1$ points

$$u_i = \frac{1}{2k+1} \sum_{j=i-k}^{i+k} x_j$$

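where the $x_j$ are measured at the equally spaced times

$$t_j = t_i + j\,\Delta t, \qquad j = -k, -k+1, \dots, k.$$

Here j counts positions relative to the center i of the averaging range.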
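
The moving average above can be sketched in a few lines of Python. This is a minimal illustration that assumes NumPy is available; the function name `moving_average` is hypothetical and not part of any package mentioned here. It returns values only for those i that have k neighbors on each side.

```python
import numpy as np

def moving_average(x, k):
    """Centered moving average u_i over 2k+1 points.

    Returns len(x) - 2k values, one for each index i that has
    k neighbors on both sides.
    """
    x = np.asarray(x, dtype=float)
    window = 2 * k + 1
    if len(x) < window:
        raise ValueError("need at least 2k+1 samples")
    weights = np.full(window, 1.0 / window)
    # mode="valid" keeps only positions where the window fits entirely
    return np.convolve(x, weights, mode="valid")

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
u = moving_average(x, k=1)
```

The first and last k points get no value here, which is why they need separate treatment.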

Within the range of length $2k+1$ defined above, the trend η is assumed to be a polynomial of order l in the variable t:

$$\eta_j(t) = a_1 + a_2 t_j + a_3 t_j^2 + \dots + a_{l+1} t_j^l.$$

The parameters l and k must satisfy the relation $l < 2k+1$. To find the set of coefficients $\{a_m\}_{m=1}^{l+1}$, the method of least squares is applied. The vector of coefficients is given by

$$a^* = -(A^T A)^{-1} A^T x$$

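where A is the $(2k+1) \times (l+1)$ matrix

$$A = -\begin{pmatrix}
1 & -k & (-k)^2 & \cdots & (-k)^l \\
1 & -k+1 & (-k+1)^2 & \cdots & (-k+1)^l \\
\vdots & \vdots & \vdots & & \vdots \\
1 & k & k^2 & \cdots & k^l
\end{pmatrix}.$$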

For j = 0, i.e. at the center of the averaging range, η is estimated by

$$\eta^*_0 = a^*_1 = \left(-(A^T A)^{-1} A^T\right)_1 x = y\,x = y_{-k} x_{-k} + \dots + y_k x_k$$

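where x is the column vector of the $2k+1$ measurements in the averaging range and y is the first row of $-(A^T A)^{-1} A^T$. By analogy, with the weights relabeled from 1 to $2k+1$, the estimate of η for each i is given by

$$\eta^*_0(i) = y\,x(i) = y_1 x_{i-k} + y_2 x_{i-k+1} + \dots + y_{2k+1} x_{i+k}.$$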
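
The weight vector y depends only on k and l, so it can be computed once and slid along the series. The following Python sketch assumes NumPy and takes time in units of Δt measured from the center of each averaging range, as in the matrix A; the function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def design_matrix(k, l):
    """A = -V, where V[j, m] = j**m for j = -k..k and m = 0..l."""
    if not l < 2 * k + 1:
        raise ValueError("require l < 2k+1")
    j = np.arange(-k, k + 1)
    return -np.vander(j, l + 1, increasing=True).astype(float)

def filter_weights(k, l):
    """Row vector y: the first row of -(A^T A)^{-1} A^T."""
    A = design_matrix(k, l)
    # pinv(A) equals (A^T A)^{-1} A^T because A has full column rank
    return -np.linalg.pinv(A)[0]

def interior_trend(x, k, l):
    """Trend estimates for the points with k neighbors on each side."""
    x = np.asarray(x, dtype=float)
    y = filter_weights(k, l)
    # correlate slides y over x without reversing it:
    # out[i] = y[0]*x[i] + y[1]*x[i+1] + ... + y[2k]*x[i+2k]
    return np.correlate(x, y, mode="valid")

y = filter_weights(k=2, l=2)   # five-point quadratic weights
t = np.arange(10.0)
x = 1.0 + 2.0 * t + 0.5 * t**2
trend = interior_trend(x, k=2, l=2)
```

For k = 2 and l = 2 the weights are (-3, 12, 17, 12, -3)/35, and noise-free quadratic data are reproduced exactly at the interior points.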

Edge effects

Edge effects arise for the first and the last k time points, where a full averaging range of $2k+1$ points centered on the point is not available. Separate estimators are therefore needed. They are given by the expressions

$$\eta^*(i) = a^{*(k+1)}_1 + a^{*(k+1)}_2 (i - k - 1) + \dots + a^{*(k+1)}_{l+1} (i - k - 1)^l \quad \text{for } i \le k,$$

$$\eta^*(i) = a^{*(n-k)}_1 + a^{*(n-k)}_2 (i + k - n) + \dots + a^{*(n-k)}_{l+1} (i + k - n)^l \quad \text{for } i > n - k,$$

where n is the number of time points, the coefficients $a^{*(k+1)}$ are those fitted to the first averaging range (centered on time point k+1), and the coefficients $a^{*(n-k)}$ are those fitted to the last averaging range (centered on time point n-k).
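
A possible way to evaluate these edge estimators is sketched below in Python, assuming NumPy. The function names are hypothetical. The least-squares fit uses the matrix V = -A, which yields the same coefficients as $-(A^T A)^{-1} A^T x$; array positions are 0-based, whereas the formulas above count time points from 1.

```python
import numpy as np

def window_coefficients(window, k, l):
    """Least-squares coefficients a*_1..a*_{l+1} for one averaging range,
    with abscissa j = -k..k measured from the center of the range."""
    j = np.arange(-k, k + 1)
    V = np.vander(j, l + 1, increasing=True).astype(float)  # V = -A
    coeffs, *_ = np.linalg.lstsq(V, np.asarray(window, dtype=float), rcond=None)
    return coeffs

def edge_trend(x, k, l):
    """Trend estimates for the first k and the last k points of x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if n < 2 * k + 1:
        raise ValueError("need at least 2k+1 samples")
    powers = np.arange(l + 1)
    a_first = window_coefficients(x[: 2 * k + 1], k, l)      # center k+1
    a_last = window_coefficients(x[n - 2 * k - 1 :], k, l)   # center n-k
    head_offsets = np.arange(-k, 0)      # i - k - 1 for i = 1..k
    tail_offsets = np.arange(1, k + 1)   # i + k - n for i = n-k+1..n
    head = (head_offsets[:, None] ** powers) @ a_first
    tail = (tail_offsets[:, None] ** powers) @ a_last
    return head, tail

t = np.arange(10.0)
x = 1.0 + 2.0 * t + 0.5 * t**2
head, tail = edge_trend(x, k=2, l=2)
```

The polynomial fitted to the outermost averaging range is simply evaluated away from its center, so no new fit is needed for each edge point.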

Variance estimation

To estimate the variance of the measurements $x_j$ in a range of length $2k+1$, one can use the expression

$$s_x^2 = \frac{1}{2k-l} \sum_{j=-k}^{k} \left(x_j - \eta^*_j\right)^2$$

where $\eta^*_j$ is given by

$$\eta^*_j(t) = a^*_1 + a^*_2 t_j + a^*_3 t_j^2 + \dots + a^*_{l+1} t_j^l.$$
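
The variance estimate for a single averaging range can be sketched as follows, assuming NumPy; `window_variance` is a hypothetical name. The divisor $2k - l$ is the number of points, $2k+1$, minus the number of fitted coefficients, $l+1$, so the sketch rejects $l = 2k$, where no degrees of freedom remain.

```python
import numpy as np

def window_variance(window, k, l):
    """Estimate s_x^2 from the residuals of the polynomial fit
    over one averaging range of 2k+1 samples."""
    dof = 2 * k - l
    if dof <= 0:
        raise ValueError("need l < 2k for a variance estimate")
    window = np.asarray(window, dtype=float)
    j = np.arange(-k, k + 1)
    V = np.vander(j, l + 1, increasing=True).astype(float)  # V = -A
    coeffs, *_ = np.linalg.lstsq(V, window, rcond=None)
    residuals = window - V @ coeffs   # x_j - eta*_j
    return np.sum(residuals**2) / dof

# k = 1, l = 0: the fit is the mean 3, residuals are -2, -1, 3
s2 = window_variance([1.0, 2.0, 6.0], k=1, l=0)
```

With l = 0 the expression reduces to the ordinary sample variance of the three points, here 14/2 = 7.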

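It follows that, at the confidence level $1-\alpha$,

$$\frac{\left|\eta^*_0(i) - \eta_0(i)\right|}{s_x \sqrt{\left((A^T A)^{-1}\right)_{11}}} \le t_{1-\alpha/2}$$

and thus the limits of the confidence range are

$$\eta^{\pm}_0(i) = \eta^*_0(i) \pm s_x \sqrt{\left((A^T A)^{-1}\right)_{11}}\; t_{1-\alpha/2}$$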

where $t_{1-\alpha/2}$ is a quantile of Student's t-distribution with $2k-l$ degrees of freedom. At this confidence level, the true trend value lies between $\eta^-(i)$ and $\eta^+(i)$.
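
A minimal Python sketch of these confidence limits for the center of one averaging range follows, assuming NumPy. The function name is hypothetical. The standard error of $\eta^*_0$ is taken as $s_x$ times the square root of $((A^T A)^{-1})_{11}$, which is the value the variance expression $s_x^2\, T (A^T A)^{-1} T^T$ gives for T = (1, 0, …, 0). The quantile is passed in as an argument so that the sketch depends on NumPy only.

```python
import numpy as np

def interior_confidence_limits(window, k, l, t_quantile):
    """Return (lower, estimate, upper) for the trend at the center
    of one averaging range of 2k+1 samples."""
    dof = 2 * k - l
    if dof <= 0:
        raise ValueError("need l < 2k for a variance estimate")
    window = np.asarray(window, dtype=float)
    j = np.arange(-k, k + 1)
    V = np.vander(j, l + 1, increasing=True).astype(float)  # V = -A
    AtA_inv = np.linalg.inv(V.T @ V)   # A^T A equals V^T V
    coeffs = AtA_inv @ V.T @ window    # same as -(A^T A)^{-1} A^T x
    residuals = window - V @ coeffs
    s_x = np.sqrt(np.sum(residuals**2) / dof)
    half_width = s_x * np.sqrt(AtA_inv[0, 0]) * t_quantile
    eta0 = coeffs[0]
    return eta0 - half_width, eta0, eta0 + half_width

# t_quantile = 2.0 is an arbitrary illustrative value, not a table entry
lower, estimate, upper = interior_confidence_limits(
    [1.0, 2.0, 6.0], k=1, l=0, t_quantile=2.0
)
```

If SciPy is installed, the quantile can be obtained with `scipy.stats.t.ppf(1 - alpha / 2, 2 * k - l)`.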

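The limits of the confidence ranges for points in the first and the last averaging range (i.e. at the offsets j = i - k - 1 and j = i + k - n from the respective centers) are given by

$$\eta^{\pm}(i) = \eta^*(i) \pm s_{\eta^*}\, t_{1-\alpha/2}$$

where

$$s^2_{\eta^*} = s^2_x\; T (A^T A)^{-1} T^T$$

and

$$T = (1, t_j, t_j^2, \dots, t_j^l).$$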
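
The edge limits can be sketched in the same way, again assuming NumPy and a hypothetical function name. The argument `offset` stands for $t_j$, the position of the point relative to the center of the first or last averaging range, in units of Δt; the quantile is supplied by the caller.

```python
import numpy as np

def edge_confidence_limits(window, k, l, offset, t_quantile):
    """Return (lower, estimate, upper) for the trend at a point that
    lies `offset` steps from the center of the averaging range."""
    dof = 2 * k - l
    if dof <= 0:
        raise ValueError("need l < 2k for a variance estimate")
    window = np.asarray(window, dtype=float)
    j = np.arange(-k, k + 1)
    V = np.vander(j, l + 1, increasing=True).astype(float)  # V = -A
    AtA_inv = np.linalg.inv(V.T @ V)
    coeffs = AtA_inv @ V.T @ window
    residuals = window - V @ coeffs
    s_x2 = np.sum(residuals**2) / dof
    T = float(offset) ** np.arange(l + 1)      # (1, t_j, ..., t_j^l)
    eta = T @ coeffs
    s_eta = np.sqrt(s_x2 * (T @ AtA_inv @ T))  # s_x^2 T (A^T A)^{-1} T^T
    return eta - s_eta * t_quantile, eta, eta + s_eta * t_quantile

# Linear fit (l = 1) over five points, evaluated at the first point
lower, estimate, upper = edge_confidence_limits(
    [0.0, 1.0, 2.0, 3.0, 5.0], k=2, l=1, offset=-2, t_quantile=1.0
)
```

In this example the quadratic form $T (A^T A)^{-1} T^T$ is 0.6 at the edge and 0.2 at the center, so the confidence range widens toward the ends of the series.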

Trend - References

  1. Siegmund Brandt, Data Analysis: Statistical and Computational Methods for Scientists and Engineers, 3rd ed., Springer-Verlag, New York, 1999.
