Mathematical Software - Trend Analysis - Tutorial

Trend^[1]

Consider a time series x(t). We assume that it can be decomposed into a deterministic part η (the trend) and a random part ε:

x_i = η_i + ε_i.

where i indexes the moments in time at which x(t) was measured. Extracting the deterministic part of a time series is important for analysing the deterministic sources of the signal. For each value x_i one can compute the moving average over 2k+1 points

$$u_i = \frac{1}{2k+1} \sum_{j=i-k}^{i+k} x_j$$

where the 2k+1 values in the averaging range are measured at the times (with j now counted relative to the centre of the range)

t_j = t_i + jΔt, j = -k, -k+1, …, k.
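As a minimal sketch of the moving average defined above, the following Python function computes u_i for every index that has k neighbours on each side. The name `moving_average` and the decision to skip the first and last k points are assumptions made for illustration; they do not come from the original text.

```python
def moving_average(x, k):
    """Return u_i = (x_{i-k} + ... + x_{i+k}) / (2k+1) for all interior points.

    Only indices with k neighbours on both sides are evaluated, so the
    result has len(x) - 2k entries.
    """
    width = 2 * k + 1
    if k < 0 or len(x) < width:
        raise ValueError("need k >= 0 and at least 2k+1 samples")
    return [sum(x[i - k:i + k + 1]) / width for i in range(k, len(x) - k)]
```

For example, `moving_average([1, 2, 3, 4, 5], 1)` averages the three windows (1, 2, 3), (2, 3, 4) and (3, 4, 5).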

Within this range of 2k+1 points, η is assumed to be a polynomial of order l in the variable t:

$$\eta_j(t) = a_1 + a_2 t_j + a_3 t_j^2 + \dots + a_{l+1} t_j^l$$

The parameters l and k must satisfy the relation l < 2k+1. The set of coefficients a_m, m = 1, …, l+1, is found by the method of least squares. The vector of coefficients is given by

$$a^* = -(A^T A)^{-1} A^T x$$

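where A is the (2k+1) × (l+1) matrix

$$A = -\begin{pmatrix}
1 & -k & (-k)^2 & \dots & (-k)^l \\
1 & -k+1 & (-k+1)^2 & \dots & (-k+1)^l \\
\vdots & \vdots & \vdots & & \vdots \\
1 & k & k^2 & \dots & k^l
\end{pmatrix}.$$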

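For j = 0, i.e. at the centre of the averaging range, η is estimated by

$$\eta^*_0 = a^*_1 = \left(-(A^T A)^{-1} A^T\right)_1 x = y x = y_{-k} x_{-k} + \dots + y_k x_k$$

where the subscript 1 denotes the first row of the matrix, y is the resulting row vector of 2k+1 weights, and x is the column vector of measurements in the range. By analogy, the value of η^* for each i is given by

$$\eta^*_0(i) = y\,x(i) = y_1 x_{i-k} + y_2 x_{i-k+1} + \dots + y_{2k+1} x_{i+k},$$

with the same weights, now numbered from 1 to 2k+1.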
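The weights y depend only on k and l, so they can be computed once and reused for every interior point. The sketch below does this with exact rational arithmetic from the Python standard library. It works with V = -A (the matrix above without its leading minus sign), for which -(A^T A)^{-1} A^T equals (V^T V)^{-1} V^T. The helper names `invert`, `pseudo_inverse`, `trend_weights` and `interior_trend` are hypothetical, and the offsets are taken in units of Δt, as in the matrix A.

```python
from fractions import Fraction


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    # Augment m with the identity matrix; Fractions keep the arithmetic exact.
    aug = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pseudo_inverse(k, l):
    """Return (V^T V)^{-1} V^T, which equals -(A^T A)^{-1} A^T for A = -V."""
    if not 0 <= l < 2 * k + 1:
        raise ValueError("need 0 <= l < 2k+1")
    # V has rows (1, j, j^2, ..., j^l) for the offsets j = -k, ..., k.
    V = [[Fraction(j) ** p for p in range(l + 1)] for j in range(-k, k + 1)]
    VtV = [[sum(row[a] * row[b] for row in V) for b in range(l + 1)]
           for a in range(l + 1)]
    inv = invert(VtV)
    return [[sum(inv[a][b] * row[b] for b in range(l + 1)) for row in V]
            for a in range(l + 1)]


def trend_weights(k, l):
    """Weights y: the first row of the pseudo-inverse."""
    return pseudo_inverse(k, l)[0]


def interior_trend(x, k, l):
    """Trend estimate at every point that has k neighbours on both sides."""
    y = trend_weights(k, l)
    return [sum(w * v for w, v in zip(y, x[i - k:i + k + 1]))
            for i in range(k, len(x) - k)]
```

For k = 2 and l = 2, `trend_weights(2, 2)` yields (-3, 12, 17, 12, -3)/35; for l = 0 the weights reduce to 1/(2k+1), i.e. the plain moving average.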

Edge effects

Edge effects arise at the first and the last k time points, where a full averaging range centred on the point is not available. New estimators are therefore needed. They are given by the expressions

$$\eta^*(i) = a^{*(k+1)}_1 + a^{*(k+1)}_2 (i-k-1) + \dots + a^{*(k+1)}_{l+1} (i-k-1)^l, \quad i \le k,$$

$$\eta^*(i) = a^{*(n-k)}_1 + a^{*(n-k)}_2 (i+k-n) + \dots + a^{*(n-k)}_{l+1} (i+k-n)^l, \quad i > n-k,$$

where n is the number of measured points, the coefficients a^{*(k+1)} are computed for the first averaging range (centred on time point k+1), and the coefficients a^{*(n-k)} are computed for the last averaging range (centred on time point n-k).
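The edge estimators can be sketched as follows: fit the polynomial once on the first window and once on the last window, then evaluate each fit at the offsets of the k points that lie outside the window centre. The function names `fit_window` and `edge_trend` are hypothetical. Python indices start at 0, so time point k+1 in the text is index k here and time point n-k is index n-1-k.

```python
from fractions import Fraction


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_window(window, l):
    """Least-squares coefficients a_1..a_{l+1} for one window of 2k+1 points."""
    k = (len(window) - 1) // 2
    if len(window) != 2 * k + 1 or not 0 <= l < len(window):
        raise ValueError("window must hold 2k+1 points with l < 2k+1")
    # V = -A: rows (1, j, ..., j^l) for offsets j = -k, ..., k.
    V = [[Fraction(j) ** p for p in range(l + 1)] for j in range(-k, k + 1)]
    VtV = [[sum(row[a] * row[b] for row in V) for b in range(l + 1)]
           for a in range(l + 1)]
    Vtx = [sum(row[a] * v for row, v in zip(V, window)) for a in range(l + 1)]
    inv = invert(VtV)
    return [sum(inv[a][b] * Vtx[b] for b in range(l + 1)) for a in range(l + 1)]


def edge_trend(x, k, l):
    """Trend estimates for the first k and the last k points of x."""
    n = len(x)
    if n < 2 * k + 1:
        raise ValueError("need at least 2k+1 samples")
    first = fit_window(x[:2 * k + 1], l)      # window centred on index k
    last = fit_window(x[n - 2 * k - 1:], l)   # window centred on index n-1-k

    def poly(coeffs, t):
        return sum(c * t ** m for m, c in enumerate(coeffs))

    head = [poly(first, i - k) for i in range(k)]
    tail = [poly(last, i - (n - 1 - k)) for i in range(n - k, n)]
    return head, tail
```

A quick sanity check is to feed in data that is itself a polynomial of order l: the edge estimates should then reproduce the data exactly.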

Variance estimation

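To estimate the variance of the measurements x_j within a range of length 2k+1, one can use the expression

$$s_x^2 = \frac{1}{2k-l} \sum_{j=-k}^{k} (x_j - \eta^*_j)^2$$

where 2k - l = (2k+1) - (l+1) is the number of degrees of freedom and η^*_j is given by

$$\eta^*_j(t) = a^*_1 + a^*_2 t_j + a^*_3 t_j^2 + \dots + a^*_{l+1} t_j^l.$$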
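A minimal sketch of this variance estimate, assuming the fitted coefficients for the window are already available and that the times t_j are the integer offsets -k, …, k (units of Δt). The function name `residual_variance` is hypothetical.

```python
def residual_variance(window, coeffs):
    """Estimate s_x^2 from one window of 2k+1 points and its fitted coefficients.

    coeffs holds a*_1, ..., a*_{l+1}; the fitted value at offset j is
    a*_1 + a*_2 j + ... + a*_{l+1} j^l.
    """
    k = (len(window) - 1) // 2
    l = len(coeffs) - 1
    dof = 2 * k - l  # (2k+1) points minus (l+1) fitted coefficients
    if len(window) != 2 * k + 1 or dof <= 0:
        raise ValueError("need 2k+1 points and l < 2k")
    fitted = [sum(c * j ** m for m, c in enumerate(coeffs))
              for j in range(-k, k + 1)]
    return sum((x - f) ** 2 for x, f in zip(window, fitted)) / dof
```

The guard requires l < 2k rather than l < 2k+1, because the estimate is undefined when no degrees of freedom remain.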

Since the estimated variance of a^*_1 is s_x^2 ((A^T A)^{-1})_{11}, it follows that, at the confidence level 1-α,

$$\frac{|\eta^*_0(i) - \eta_0(i)|}{s_x \sqrt{((A^T A)^{-1})_{11}}} \le t_{1-\alpha/2}$$

and thus the confidence limits are

$$\eta^{\pm}_0(i) = \eta^*_0(i) \pm s_x \sqrt{((A^T A)^{-1})_{11}} \; t_{1-\alpha/2}$$

where t_{1-α/2} is a quantile of Student's distribution with 2k-l degrees of freedom. At this confidence level, the true trend value lies between η^-_0(i) and η^+_0(i).

For points in the first and the last averaging range (i.e. at the offsets j = i - k - 1 and j = i + k - n, respectively), the confidence limits are given by

$$\eta^{\pm}(i) = \eta^*(i) \pm s_{\eta^*} t_{1-\alpha/2}$$

where

$$s^2_{\eta^*} = s_x^2 \, T (A^T A)^{-1} T^T$$

and

$$T = (1, t_j, t_j^2, \dots, t_j^l).$$
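The sketch below evaluates s_{η^*} and the resulting confidence limits for a given offset t_j (in units of Δt from the window centre). It is an illustration under stated assumptions: the function names are hypothetical, s_x is supplied by the caller, and the Student quantile t_{1-α/2} for 2k-l degrees of freedom is passed in rather than computed (it could come from a table, or for instance from `scipy.stats.t.ppf` if SciPy is available). Because A enters only through A^T A, its leading minus sign cancels and V = -A is used.

```python
import math
from fractions import Fraction


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def trend_std_error(s_x, k, l, t_j):
    """Return s_eta = s_x * sqrt(T (A^T A)^{-1} T^T) at offset t_j."""
    if not 0 <= l < 2 * k + 1:
        raise ValueError("need 0 <= l < 2k+1")
    V = [[Fraction(j) ** p for p in range(l + 1)] for j in range(-k, k + 1)]
    VtV = [[sum(row[a] * row[b] for row in V) for b in range(l + 1)]
           for a in range(l + 1)]
    C = invert(VtV)  # equals (A^T A)^{-1}
    T = [Fraction(t_j) ** m for m in range(l + 1)]
    quad = sum(T[a] * C[a][b] * T[b]
               for a in range(l + 1) for b in range(l + 1))
    return s_x * math.sqrt(quad)


def confidence_limits(eta_star, s_x, k, l, t_j, t_quantile):
    """Return (eta_minus, eta_plus) around the trend estimate eta_star."""
    half_width = trend_std_error(s_x, k, l, t_j) * t_quantile
    return eta_star - half_width, eta_star + half_width
```

With t_j = 0 the row vector T reduces to (1, 0, …, 0), so the quadratic form picks out the element ((A^T A)^{-1})_{11} and the same function covers the centre of an interior averaging range.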

Trend - References

[1] Siegmund Brandt, Data Analysis: Statistical and Computational Methods for Scientists and Engineers, 3rd ed., Springer-Verlag, New York, 1999.
