Non-Stationary Model

Introduction

Corporations, financial institutions, researchers and individual investors often use financial time series data, such as exchange rates, asset prices, inflation, GDP and other macroeconomic indicators, in stock market analysis, economic forecasts or studies of the data itself (Kitagawa & Akaike, 1978). Refining the data is key to applying it, because it lets one isolate the data points that are relevant to a given stock report. Such data points often have a mean, variance and covariance that change with time; that is, the data are non-stationary. Non-stationary behavior can take the form of trends, cycles, random walks or a combination of the three. As a rule, non-stationary data are unpredictable and cannot be modeled or forecast reliably. Results obtained from non-stationary time series may be spurious: they may indicate a relationship between two variables where none exists. To obtain consistent and reliable results, non-stationary data need to be transformed into stationary data. A non-stationary process has a variable variance and a mean that does not remain near, or return to, a long-run mean over time. A stationary process, in contrast, reverts to a constant long-run mean and has a constant variance that is independent of time.

A stationary time series has a constant mean and a constant variance, and its autocovariance depends only on the lag between observations, not on time itself. Stationarity is important for standard econometric theory and is crucial for obtaining consistent estimators. Plotting the series against time is the quickest way of judging whether a process is stationary. If the graph crosses the sample mean many times, chances are that the series is stationary. If the graph rarely crosses the mean, this indicates persistent trends away from the mean of the series. A variable whose mean grows around a fixed trend is a trend-stationary variable. This gives a classical way of describing an economic time series that grows at a constant rate. Such a series tends to evolve around a steady, upward-sloping curve without big swings away from that curve. Detrending the series gives a stationary process.

Common examples of non-stationary processes are the random walk without drift, the random walk with drift, the deterministic trend, and the combination of a random walk with drift and a deterministic trend.

Random walk without drift, or pure random walk (Y_t = Y_{t-1} + ε_t)

A random walk predicts that the value at time t equals the previous period's value plus a stochastic component ε_t that is white noise, that is, independent and identically distributed with mean 0 and variance σ². A random walk is also called a unit root process, a process with a stochastic trend, or a process integrated of some order (here, order one). The process can move away from its starting value in either a positive or a negative direction, so it is a non-mean-reverting process. Its variance grows with time and goes to infinity as time goes to infinity, which is why a random walk cannot be predicted. The white noise itself is a purely random process: every element is independent of every other element and each element has an identical distribution.
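The growth of the variance can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes Gaussian white noise, a starting value Y_0 = 0, and hypothetical helper names (simulate_random_walk, variance_at). It simulates many independent random walks and estimates the variance across paths at two points in time.

```python
import random
import statistics


def simulate_random_walk(n_steps, sigma=1.0, drift=0.0, rng=None):
    """Return [Y_1, ..., Y_n] for Y_t = drift + Y_{t-1} + eps_t, with Y_0 = 0 (assumed)."""
    if rng is None:
        rng = random.Random()
    path, level = [], 0.0
    for _ in range(n_steps):
        level += drift + rng.gauss(0.0, sigma)  # add the drift and a white-noise shock
        path.append(level)
    return path


def variance_at(paths, t):
    """Sample variance across paths of the value at time t (1-based)."""
    return statistics.variance(path[t - 1] for path in paths)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
paths = [simulate_random_walk(100, rng=rng) for _ in range(2000)]
var_10, var_100 = variance_at(paths, 10), variance_at(paths, 100)
print(round(var_10, 1), round(var_100, 1))  # close to t * sigma**2, i.e. near 10 and 100
```

Under these assumptions the estimated variance at time t should sit near tσ², so it keeps growing as t grows instead of settling at a constant.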

Random walk with drift (Y_t = α + Y_{t-1} + ε_t)

A random walk with drift predicts that the value at time t equals the previous period's value plus a constant, or drift (α), and a white noise term (ε_t). Like the pure random walk, it does not revert to a long-run mean and its variance depends on time. The stochastic processes considered here are defined in terms of their behavior at time t, where t is an integer. This causes no problem when t can be mapped to the integers, but essential differences arise if t is a continuous variable. The value of the time series at a given time is usually taken to be a continuous random variable. Many time series, notably those from volcanology, take other kinds of values, for example discrete events or counts such as the recorded number of eruptions.

Deterministic trend (Y_t = α + βt + ε_t)

A random walk with drift is often confused with a deterministic trend, because both include a constant (the drift) and a white noise component. The difference is that in a random walk the value at time t is regressed on the last period's value (Y_{t-1}), while in a deterministic trend it is regressed on a time trend (βt). A non-stationary process with a deterministic trend has a mean that grows around a fixed trend, and the slope of that trend is constant and independent of time.
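To make the contrast concrete, the first two moments of the two processes can be worked out. This is a sketch under assumptions the text does not state explicitly: a fixed starting value Y_0 for the random walk and white noise ε_t with variance σ².

```latex
\begin{align*}
\text{Random walk with drift:}\quad
Y_t &= Y_0 + \alpha t + \sum_{i=1}^{t} \varepsilon_i, &
E(Y_t) &= Y_0 + \alpha t, &
\operatorname{var}(Y_t) &= t\sigma^2, \\
\text{Deterministic trend:}\quad
Y_t &= \alpha + \beta t + \varepsilon_t, &
E(Y_t) &= \alpha + \beta t, &
\operatorname{var}(Y_t) &= \sigma^2.
\end{align*}
```

Both means grow linearly in t, but only the random walk accumulates its shocks, so only its variance grows with time.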

Random walk with drift and deterministic trend (Y_t = α + Y_{t-1} + βt + ε_t)

This non-stationary process combines a random walk with a drift component and a deterministic trend. It specifies the value at time t by the last period's value, a drift, a trend and a stochastic component.

Trend and difference stationary

A random walk with or without drift can be transformed into a stationary process by differencing, that is, by subtracting Y_{t-1} from Y_t. This gives Y_t - Y_{t-1} = ε_t for the pure random walk and Y_t - Y_{t-1} = α + ε_t for the random walk with drift, and the process is then called difference-stationary. The disadvantage of differencing is that one observation is lost each time the difference is taken. A non-stationary process with a deterministic trend becomes stationary after removing the trend, or detrending (Cowpertwait & Metcalfe, 2009). No observation is lost when a non-stationary process is made stationary by detrending. In the case of a random walk with drift and a deterministic trend, detrending can remove the deterministic trend and the drift, but the variance will still go to infinity. Differencing must therefore also be applied to remove the stochastic trend.
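Differencing is simple enough to sketch in a few lines. The example below is illustrative only: the function name difference, the drift value and the list of shocks are hypothetical. It builds a random walk with drift from known shocks and shows that first differences return α + ε_t at the cost of one observation.

```python
def difference(series):
    """First difference: [Y_1 - Y_0, Y_2 - Y_1, ...], one element shorter than the input."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Hypothetical example: a random walk with drift alpha = 0.5 built from known shocks.
alpha = 0.5
shocks = [0.3, -1.2, 0.8, 0.1, -0.4]
walk = [0.0]
for eps in shocks:
    walk.append(alpha + walk[-1] + eps)  # Y_t = alpha + Y_{t-1} + eps_t

diffs = difference(walk)
print(len(walk), len(diffs))  # 6 5: one observation is lost
print([round(d - alpha, 10) for d in diffs])  # the original shocks, up to rounding
```

Subtracting the drift from the differenced series leaves only the white noise shocks, which is what makes the differenced process stationary.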

Table 1 Differencing

As Table 1 shows, the curve of the random walk with drift differs from that of the difference-stationary process. The advantage of differencing is that it removes the stochastic trend (the trend in the variance) and is simple to apply. The disadvantage is that one observation is lost each time the difference is taken.

Table 2 Detrending, or removing the trend

A non-stationary process with a deterministic trend becomes stationary after the trend is removed, a step also known as detrending. The trend βt is subtracted from the non-stationary process Y_t = α + βt + ε_t, which leaves the stationary process α + ε_t. No observations are lost when detrending is used to transform a non-stationary process into a stationary one. For a random walk with drift and a deterministic trend, detrending can remove the drift and the deterministic trend, but the variance will still go to infinity, so differencing must also be applied to remove the stochastic trend.
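In practice the slope β is not known and has to be estimated before it can be subtracted. The sketch below assumes an ordinary least squares fit on a time index t = 0, 1, ..., n - 1, which is one common choice and not something the text prescribes. The function names and the sample data are hypothetical.

```python
def fit_linear_trend(series):
    """Ordinary least squares fit of y_t = a + b * t for t = 0, 1, ..., n - 1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def detrend(series):
    """Subtract the fitted trend b * t; the result has the same length as the input."""
    _, slope = fit_linear_trend(series)
    return [y - slope * t for t, y in enumerate(series)]


# Hypothetical data: alpha = 2, beta = 0.5, plus a small fixed disturbance pattern.
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
series = [2.0 + 0.5 * t + e for t, e in enumerate(noise)]
stationary = detrend(series)
print(len(series) == len(stationary))  # True: no observation is lost
```

Unlike differencing, the detrended series keeps every observation, which matches the advantage described above.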

In financial models, using non-stationary time series data produces unreliable and spurious results and leads to poor understanding and forecasting. The solution is to transform the time series data so that it becomes stationary. Differencing transforms a random walk, with or without drift, into a stationary process. If the time series contains a deterministic trend, spurious results are avoided by detrending. A non-stationary series may combine a stochastic and a deterministic trend at the same time; to avoid misleading results, both differencing and detrending are then applied. Detrending removes the deterministic trend and differencing removes the trend in the variance.

To obtain meaningful results, one should always check whether two variables are integrated of the same order. A stochastic process is strictly stationary if, given t_1, ..., t_l, the joint statistical distribution of X_{t_1}, ..., X_{t_l} is the same as the joint statistical distribution of X_{t_1+r}, ..., X_{t_l+r} for all l and r. This means that all moments of all orders (expectations, variances, third-order and higher) are the same everywhere in the process. It also means that the joint distribution of (X_{t+r}, X_{s+r}) is the same as that of (X_t, X_s), and hence cannot depend on s or t individually but only on the distance between them, s - t. Strict stationarity is too demanding for everyday use, so a weaker definition, second-order or weak stationarity, is usually used. Second-order stationarity means that the mean and variance of the stochastic process do not depend on t and that the covariance between X_t and X_s depends only on s - t.
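Written out, the second-order conditions take the following form. The symbols μ, σ_X² and γ are notation assumed here for the constant mean, the constant variance and the autocovariance function; they are not defined in the text above.

```latex
\begin{align*}
E(X_t) &= \mu && \text{for all } t, \\
\operatorname{var}(X_t) &= \sigma_X^2 < \infty && \text{for all } t, \\
\operatorname{cov}(X_t, X_s) &= \gamma(s - t) && \text{for all } s, t.
\end{align*}
```

A pure random walk fails the second condition, since its variance grows with t.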

Global non-stationarity

The random walk is an example of global non-stationarity. In a globally non-stationary process, the parameters and the process are fixed at some time in the past (generally infinitely long ago), and the process itself then evolves while the rules of its evolution do not change.

AR, MA and ARMA models

This section considers some basic probability models that are widely used for modeling time series.

Moving Average models.

Probably the next simplest model is the one constructed from simple linear combinations of lagged elements of a purely random process {ε_t} with E(ε_t) = 0. A moving average process {X_t} of order q is defined as

\[ X_t = \beta_0 \varepsilon_t + \beta_1 \varepsilon_{t-1} + \cdots + \beta_q \varepsilon_{t-q} = \sum_{i=0}^{q} \beta_i \varepsilon_{t-i} \tag{11.5} \]

and is abbreviated MA(q). With a newly defined process it is of interest to establish its statistical properties. For an MA(q) process the mean is simple to compute, since the expectation of a sum is the sum of the expectations:

\[ E(X_t) = E\left( \sum_{i=0}^{q} \beta_i \varepsilon_{t-i} \right) = \sum_{i=0}^{q} \beta_i E(\varepsilon_{t-i}) = 0 \tag{11.6} \]

because E(ε_r) = 0 for every r. A similar argument applies to the variance calculation:

\[ \operatorname{var}(X_t) = \operatorname{var}\left( \sum_{i=0}^{q} \beta_i \varepsilon_{t-i} \right) = \sum_{i=0}^{q} \beta_i^2 \operatorname{var}(\varepsilon_{t-i}) = \sigma^2 \sum_{i=0}^{q} \beta_i^2 \tag{11.7} \]

since var(ε_r) = σ² for all r and the ε_t are independent.

The sample autocorrelation is computed as r(τ) = c(τ) / c(0), where c(τ) is the sample autocovariance at lag τ. If the sample autocovariance cuts off at a particular lag q, that is, if it is effectively zero for lags of q + 1 or higher, then one can entertain the MA(q) model defined above as the underlying probability model.
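The cut-off at lag q can be seen directly from the theoretical autocovariance. The sketch below assumes the MA(q) form X_t = β_0 ε_t + ... + β_q ε_{t-q} with independent noise of variance σ², for which the standard result is γ(τ) = σ² Σ β_i β_{i+τ} for 0 ≤ τ ≤ q and γ(τ) = 0 for τ > q. The function name and the coefficient values are hypothetical.

```python
def ma_autocovariance(betas, sigma2=1.0, max_lag=None):
    """Theoretical autocovariance of X_t = sum_i betas[i] * eps_{t-i}.

    gamma(tau) = sigma2 * sum_i betas[i] * betas[i + tau] for 0 <= tau <= q,
    and gamma(tau) = 0 for tau > q, where q = len(betas) - 1.
    """
    q = len(betas) - 1
    if max_lag is None:
        max_lag = q + 2  # show two lags beyond the cut-off
    gammas = []
    for tau in range(max_lag + 1):
        terms = (betas[i] * betas[i + tau] for i in range(q - tau + 1))
        gammas.append(sigma2 * sum(terms))  # empty sum (tau > q) gives 0
    return gammas


# Hypothetical MA(2) with beta = (1, 0.5, 0.25) and unit noise variance.
gammas = ma_autocovariance([1.0, 0.5, 0.25])
rhos = [g / gammas[0] for g in gammas]  # theoretical autocorrelations
print(gammas)  # [1.3125, 0.625, 0.25, 0.0, 0.0]
```

The value at lag 0 reproduces the variance σ² Σ β_i², and every lag beyond q = 2 is exactly zero, which is the pattern one looks for in the sample autocovariance.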

The remaining quantity of interest is the autocovariance. The simplest way to see what it is for a first-order autoregressive process X_t = αX_{t-1} + ε_t (equation 11.17) is to assume that the process is stationary, multiply both sides of (11.17) by X_{t-τ} and take expectations:

\[ E(X_t X_{t-\tau}) = \alpha E(X_{t-1} X_{t-\tau}) + E(\varepsilon_t X_{t-\tau}) \]

For τ > 0, ε_t is independent of X_{t-τ}, and hence E(ε_t X_{t-τ}) = E(ε_t) E(X_{t-τ}) = 0, since the purely random process has zero mean. The equation therefore becomes

\[ \gamma(\tau) = \alpha \gamma(\tau - 1) \]

where γ(τ) denotes the autocovariance at lag τ. This formula is the simplest example of the well-known Yule-Walker equations; more complex versions can be used to obtain formulae for the autocovariance of more general AR(p) processes.

Several checks and tests can be made, but the direct comparison of the sample autocovariance with reference values, such as the autocovariance implied by a candidate model, is a major first step in model identification.

At this point one can ask what is meant by the term “effectively zero”. The sample autocovariance is an empirical statistic computed from the random sample at hand. If more data in the time series were collected, or another sample stretch were used, the sample autocovariance would be different (although for long samples and stationary time series the probability of a large difference should be small). Hence sample autocovariances and autocorrelations are random quantities, and “effectively zero” translates into a statistical hypothesis test of whether the true autocorrelation is equal to zero or not.

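Finally, while still on the subject of the sample autocovariance, with the usual definition c(τ) = (1/n) Σ_{t=1}^{n-τ} (x_t - x̄)(x_{t+τ} - x̄), note what happens at the extreme points of the range of τ. At lag zero,

\[ c(0) = \frac{1}{n} \sum_{t=1}^{n} (x_t - \bar{x})^2 \]

which is also the sample variance, and at the largest lag

\[ c(n-1) = \frac{1}{n} (x_1 - \bar{x})(x_n - \bar{x}) \]

The first is an average over n terms, whereas the second rests on a single pair of observations, so estimates at large lags are far less reliable.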

The horizontal dashed lines in the bottom plot of Figure 11.1 are approximate 95% significance bands. Any coefficient that falls outside these bands (like those at lags 1 and 2 in the figure) is regarded as significant, or at least worth further consideration. It is also interesting that, although ρ(τ) = 0 for τ > 2, the sample autocorrelations at lags 10 to 14 inclusive all look quite large. This merely demonstrates that the sample autocorrelation function (acf) is a random quantity, and that some care and experience are required to avoid reading too much into it.
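A sample acf with significance bands can be sketched as follows. The text does not say how the bands in the figure were computed, so this example assumes the common large-sample approximation ±1.96/√n for white noise. The function names and the test series are hypothetical.

```python
import math


def sample_autocovariance(x, lag):
    """c(lag) = (1/n) * sum over t of (x_t - mean) * (x_{t+lag} - mean)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def sample_acf(x, max_lag):
    """Sample autocorrelations r(0), ..., r(max_lag), with r(tau) = c(tau) / c(0)."""
    c0 = sample_autocovariance(x, 0)
    return [sample_autocovariance(x, lag) / c0 for lag in range(max_lag + 1)]


def significance_band(n, z=1.96):
    """Approximate 95% band +/- z / sqrt(n) for the acf of white noise (assumed form)."""
    return z / math.sqrt(n)


def significant_lags(x, max_lag):
    """Lags >= 1 whose sample autocorrelation falls outside the band."""
    band = significance_band(len(x))
    acf = sample_acf(x, max_lag)
    return [lag for lag in range(1, max_lag + 1) if abs(acf[lag]) > band]


# Hypothetical strongly alternating series of 100 points.
x = [1.0, -1.0] * 50
print(round(significance_band(len(x)), 3))  # 0.196
print(significant_lags(x, 3))  # [1, 2, 3]
```

With roughly one coefficient in twenty expected to fall outside a 95% band by chance alone, a few isolated exceedances at high lags are not by themselves evidence of structure.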

Autoregressive models.

The other basic development from a purely random process is the autoregressive model which, as its name suggests, models a process whose future values depend on its recent past. The process {X_t} follows an AR(p) model if it satisfies

\[ X_t = \sum_{i=1}^{p} \alpha_i X_{t-i} + \varepsilon_t \]

where {ε_t} is a purely random process. Following tradition, we first consider the simplest AR model, AR(1). Here we simplify notation and define {X_t} by

\[ X_t = \alpha X_{t-1} + \varepsilon_t \tag{11.17} \]

Since the ε_t are independent, ε_t is independent of X_{t-1}, so for a stationary process with |α| < 1 the mean is zero and the variance is var(X_t) = σ² / (1 - α²). The remaining quantity of interest is the autocovariance (Schneider & Weil, 2008). The simplest way to see what it is, is to assume that the process is stationary, multiply both sides of (11.17) by X_{t-τ} and take expectations, which gives γ(τ) = αγ(τ - 1) and hence ρ(τ) = γ(τ) / γ(0) = α^τ for τ ≥ 0.

We simulated two AR(1) processes, each of 200 observations. The first, ar1pos, is a realization with α = 0.9 and the second, ar1neg, is a realization with α = −0.9. Plots of each realization and its autocorrelation function accompany the simulation. The exponential decay of the autocorrelation of ar1pos is clear, and one can read off ρ(1) = α = 0.9 from the plot. For ar1neg similar considerations apply, except that ρ(τ) has an extra factor of (−1)^τ, which results in oscillation.
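A simulation of this kind can be reproduced along the following lines. This is a sketch, not the code behind the original plots: it assumes Gaussian noise with unit variance, a starting value of zero, arbitrary seeds, and the standard result ρ(τ) = α^τ for a stationary AR(1) process. The function names are hypothetical.

```python
import random


def simulate_ar1(alpha, n, sigma=1.0, seed=0):
    """Simulate X_t = alpha * X_{t-1} + eps_t, starting from X_0 = 0 (assumed)."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = alpha * prev + rng.gauss(0.0, sigma)
        x.append(prev)
    return x


def ar1_autocorrelation(alpha, max_lag):
    """Theoretical rho(tau) = alpha ** tau for a stationary AR(1), |alpha| < 1."""
    return [alpha ** tau for tau in range(max_lag + 1)]


def lag1_sample_autocorrelation(x):
    """r(1) = c(1) / c(0) with the usual 1/n sample autocovariance."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    c1 = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1)) / n
    return c1 / c0


ar1pos = simulate_ar1(0.9, 200, seed=1)
ar1neg = simulate_ar1(-0.9, 200, seed=2)
print(ar1_autocorrelation(-0.9, 3))  # alternating signs, shrinking magnitude
print(lag1_sample_autocorrelation(ar1pos), lag1_sample_autocorrelation(ar1neg))
```

The sample lag-1 autocorrelations should land near +0.9 and −0.9 respectively, though with only 200 observations some sampling error is expected.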

ARMA Models.

AR and MA models express different kinds of stochastic dependence. AR processes have a Markov-like quality in which future values depend on the past, whereas MA processes combine elements of randomness from the past using a moving window. An obvious step is to combine both types of behavior into a single ARMA(p, q) model, obtained by simple concatenation. The process {X_t} is ARMA(p, q) if

\[ X_t = \sum_{i=1}^{p} \alpha_i X_{t-i} + \sum_{j=0}^{q} \beta_j \varepsilon_{t-j} \]
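The concatenation of the two parts can be illustrated with a small recursive filter. The sketch below assumes the form X_t = Σ α_i X_{t-i} + Σ β_j ε_{t-j} with i running from 1 to p and j from 0 to q, and treats all values before the start of the sample as zero. The function names and coefficient values are hypothetical.

```python
import random


def arma_filter(ar, ma, noise):
    """Apply X_t = sum_i ar[i-1] * X_{t-i} + sum_j ma[j] * eps_{t-j} to a noise list.

    ar holds (alpha_1, ..., alpha_p); ma holds (beta_0, ..., beta_q).
    Values before the start of the sample are taken to be zero (an assumption).
    """
    x = []
    for t in range(len(noise)):
        ar_part = sum(a * x[t - i] for i, a in enumerate(ar, start=1) if t - i >= 0)
        ma_part = sum(b * noise[t - j] for j, b in enumerate(ma) if t - j >= 0)
        x.append(ar_part + ma_part)
    return x


def simulate_arma(ar, ma, n, sigma=1.0, seed=0):
    """Simulate n observations driven by Gaussian white noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, sigma) for _ in range(n)]
    return arma_filter(ar, ma, noise)


# Impulse response of a hypothetical ARMA(1, 1): alpha_1 = 0.5, beta = (1, 0.4).
print(arma_filter([0.5], [1.0, 0.4], [1.0, 0.0, 0.0, 0.0]))  # [1.0, 0.9, 0.45, 0.225]
```

Feeding in a single unit shock shows both mechanisms at work: the MA part spreads the shock over q + 1 periods, and the AR part then lets it decay geometrically.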

Fitting ARMA models.

So far, we have identified the probability models underlying ARMA processes and some mathematical quantities related to them. Of course, once one has real data one would like answers to questions such as “Are these models suitable for my data?”, “Can these models accommodate my data?”, “How do I fit these models?” and “Does the fitted model describe the data well?”. Questions of this kind lead directly to the Box-Jenkins procedure; see Box et al. (1994). Part of this procedure involves analyzing the sample autocorrelation function (and a related function known as the partial autocorrelation function) to settle on the order of any AR or MA terms. However, there are several other aspects, such as removing deterministic trends, removing outliers and checking residuals. A full treatment is beyond the length constraints of this article, so we refer the interested reader to Chatfield (2003) in the first instance.

Conclusion

Differencing transforms a random walk, with or without drift, into a stationary process. If the time series being analyzed contains a deterministic trend, spurious results are avoided by detrending. A non-stationary series may combine a deterministic and a stochastic trend at the same time; to avoid misleading results, both detrending and differencing are then applied. Detrending removes the deterministic trend and differencing removes the trend in the variance.

References

Kitagawa, G., & Akaike, H. (1978). A procedure for the modeling of non-stationary time series. Annals of the Institute of Statistical Mathematics, 30(1), 351-363.

Cowpertwait, P. S., & Metcalfe, A. V. (2009). Non-stationary Models. In Introductory Time Series with R (pp. 137-157). Springer New York.

Schneider, R., & Weil, W. (2008). Non-stationary Models. Stochastic and Integral Geometry, 521-556.