Fuzzy Linear Regression for the Time Series Data which is Fuzzified with SMRGT Method

Our work on regression and classification provides a new contribution to the analysis of time series, which have been used in many areas for years. In time series regression applications, the methods used to correct autocorrelation may fail to converge; in that case the analysis either does not succeed or the degree of the model has to be changed, which may not be desirable in every situation. In our study, recommended for these situations, time series data were fuzzified using the simple membership function and fuzzy rule generation technique (SMRGT), and an equation for estimating future values was created by applying the fuzzy least squares regression (FLSR) method, which is a simple linear regression method, to these data. Although SMRGT is successful in determining flow discharge in open channels and can be used confidently for flow discharge modeling in open canals, as well as in pipe flow with some modifications, there is no evidence that this technique is successful in fuzzy linear regression modeling. Therefore, in order to address the lack of such a model, a new hybrid model is described in this study. Finally, to demonstrate the efficiency of our method, classical linear regression for time series data and linear regression for fuzzy time series data were applied to two different data sets, and the performance of the two approaches was compared using different measures.


Introduction
In modeling some systems in which human estimation is influential, a fuzzy structure can be encountered. The parameters of this fuzzy structure can be presented as a fuzzy linear function obtained from fuzzy sets. Fuzzy linear functions are defined by Zadeh's extension principle [1].
The fuzzy time series forecasting problem has been studied by several authors in recent decades. Song and Chissom introduced time-variant and time-invariant fuzzy time series models, gave their properties and discussed procedures for developing fuzzy time series models [2]. Chen (1996) proposed a method that is more efficient than the one presented by Song and Chissom because it uses simplified arithmetic operations rather than the complicated max-min composition operations [3]. Huarng determined effective lengths of intervals, namely distribution-based and average-based lengths, which had not been addressed in previous studies [4]. Chen (2002) presented a new method for forecasting problems based on high-order fuzzy time series and genetic algorithms, which achieves a higher forecasting accuracy rate than the existing methods [5]. Tsaur et al. obtained a fuzzy relation matrix that represents a time-invariant relation to measure the degrees of fuzziness [6]. Singh proposed a new method of fuzzy time series forecasting based on difference parameters to improve the accuracy of the forecasted values [7].
All these studies are directly related to fuzzy time series. The aim of the present work is to introduce a method for fuzzifying variables so that the fuzzy least squares method can be used.
When a fuzzy linear function is considered for modeling the fuzziness of the system, a fuzzy linear regression analysis is formulated. In time series regression applications, the methods used to correct autocorrelation may fail to converge; in that case the analysis either does not succeed or the degree of the model has to be changed. Changing the degree of the model may not be desirable in every situation. In these situations a new regulation methodology is needed.
Toprak proposed SMRGT for open canal flow modeling [8]. Toprak et al. also used this method to determine losses in water networks [9,10]. Coşkun created an automated fuzzy model generation procedure based on SMRGT [11]. Yalaz (Toprak) et al. used SMRGT to fuzzify variables so that the least squares method could be used to predict linear parameters [12,13].
In this study a new hybrid model is proposed, based on the basic concepts of SMRGT for fuzzifying variables and on fuzzy regression approaches to time series forecasting [14]. In the proposed model, SMRGT is used to preprocess the raw data and to provide the necessary background for applying a fuzzy regression model. So far, the success of SMRGT has been demonstrated only in determining flow discharge in open channels. For this reason, the new hybrid model is described with some changes to SMRGT, in order to show the success of this technique in fuzzy linear regression modeling.
In order to highlight its appropriateness and effectiveness, the proposed method is applied to two different data sets and its performance is compared with that of the linear regression model for time series.

Time series
A time series is a series of observations obtained as a function of time. Time series analysis, one of the most important ways of making predictions about the future, is based on the time dependency between successive observations. Future observations can be predicted with a function fitted to the series. Economic time series hold first place among the application areas of time series.
To perform time series analysis, it is necessary to control for trend, cyclical variations, seasonal variations and irregular variations, which lead to errors in parameter estimation; in other words, stationarity must be taken into account. A time series is stationary if its mean and variance do not vary systematically over time [15]. In nonstationary series, the relevant effects are removed using indexes calculated for the effects other than time. Time series analysis is applicable after this process [16].
In most time series, consecutive observations are interconnected. In this case, a method that does not take advantage of this dependency is not suitable. Instead, it is more appropriate to use models that exploit the dependence structure of time series very efficiently, known as Box-Jenkins (BJ) forecasting models [17].
BJ forecasting models are successful methods for predicting future values of univariate time series. Because these models determine the structure of a time series, use the dependency between observations and include statistical tests in the model determination stages, they are superior to other estimation methods for short-term prediction [18]. BJ forecasting models can be divided into four models: AR, MA, ARMA and ARIMA. Each model has unknown parameters which have to be estimated. In practice, the commonly used AR models are the first- and second-degree models, denoted AR(1) and AR(2). AR models can be written in the form of difference models. Using the backward shift operator $B$, where $B y_t = y_{t-1}$, the AR model of order $p$ is written as

$$\varepsilon_t = (1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)\, y_t$$

The stationarity condition for AR models is that the roots obtained by setting the polynomial $1 - \phi_1 B - \cdots - \phi_p B^p$ equal to zero lie outside the unit circle. If the roots are outside the unit circle, AR($p$) can be used for stationary time series [17].

Selection of an appropriate model proceeds in steps. In the identification steps, model selection is performed using the autocorrelation and partial autocorrelation functions. The degree of the parameters is also determined for the model selected in this way.
iii. The parameters of the model judged to be appropriate are estimated. The parameter estimation phase is extremely complex and time-consuming. The procedure differs by model type and, for each model, is performed with statistical packages.
iv. Using the standard error of the estimated autocorrelations as a measure, it cannot be clearly shown whether autocorrelations calculated at low lags differ significantly from zero [19]. Therefore, the adequacy of the model is tested with the Box-Pierce statistic. If the model is adequate, it is used for estimation. Otherwise, the procedure returns to the first step.
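The root condition for stationarity can be checked numerically. The following is a minimal sketch for the AR(1) and AR(2) cases only; the function names and the symbols `phi1` and `phi2` are illustrative assumptions, not notation taken from the paper.

```python
import cmath


def ar_characteristic_roots(phi1, phi2=0.0):
    """Roots z of 1 - phi1*z - phi2*z**2 = 0 (AR(1) when phi2 == 0)."""
    if phi2 == 0.0:
        # Linear case: 1 - phi1*z = 0; there is no root when phi1 is also zero.
        return [] if phi1 == 0.0 else [complex(1.0 / phi1)]
    # Quadratic case, rewritten as phi2*z**2 + phi1*z - 1 = 0.
    disc = cmath.sqrt(phi1 * phi1 + 4.0 * phi2)
    return [(-phi1 + disc) / (2.0 * phi2), (-phi1 - disc) / (2.0 * phi2)]


def is_stationary(phi1, phi2=0.0):
    """True when every characteristic root lies strictly outside the unit circle."""
    return all(abs(z) > 1.0 for z in ar_characteristic_roots(phi1, phi2))


if __name__ == "__main__":
    for coeffs in [(0.9,), (1.2,), (0.5, 0.3), (0.5, 0.6)]:
        print(coeffs, is_stationary(*coeffs))
```

Working with the roots directly, rather than with inequalities on the coefficients, keeps the check close to the wording of the condition and handles complex roots through `cmath`.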

Linear regression for time series
When the autoregressive AR($p$) model in equation (1) is considered for $t = 2, \dots, n$, it can be written as a set of equations, one per observation (4). This equation is not defined for $t = 1$ because there is no $y_0$. With $Y = (y_2, \dots, y_n)^T$, the matrix form of equations (4) can be shown as

$$Y = X\phi + \varepsilon \quad (5)$$

In the linear regression method, substituting the estimated parameter $\phi$ into the regression equation makes it possible to make predictions about the future [20].
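To make the matrix form concrete, the sketch below builds the response vector and the matrix of lagged values from a series. The function name `lagged_design` and the convention of dropping the first `p` observations (for which the lags do not exist) are assumptions made for illustration.

```python
def lagged_design(series, p):
    """Return (X, Y) for an AR(p) regression without intercept.

    Row i of X holds the p values preceding Y[i], most recent lag first.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if len(series) <= p:
        raise ValueError("series must contain more than p observations")
    X = [[series[t - k] for k in range(1, p + 1)] for t in range(p, len(series))]
    Y = [series[t] for t in range(p, len(series))]
    return X, Y


if __name__ == "__main__":
    X, Y = lagged_design([1.0, 2.0, 3.0, 4.0, 5.0], p=2)
    for row, target in zip(X, Y):
        print(row, "->", target)
```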
Equation (5), which is constructed from the AR($p$) time series model, is similar to a linear regression equation. However, to be able to apply the regression method to this equation, the assumption $\mathrm{Cov}(\varepsilon_t, \varepsilon_{t-1}) = 0$, which is not satisfied in time series, must hold [21]. Thus, the Durbin-Watson (DW) test is applied to this model to measure autocorrelation [22,23]. If there is autocorrelation between consecutive errors, the Prais-Winsten approach [24] can be used to correct it. In this method, autocorrelation between consecutive errors is corrected gradually [25]. The parameters $\phi$ can then be estimated with the linear least squares regression method.
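As an illustration of how autocorrelation between consecutive errors is measured, the sketch below computes the standard Durbin-Watson statistic from a list of residuals. The function name is an assumption; values near 2 indicate little first-order autocorrelation, values near 0 positive autocorrelation and values near 4 negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    den = sum(e * e for e in residuals)
    if den == 0:
        raise ValueError("all residuals are zero")
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return num / den


if __name__ == "__main__":
    print(durbin_watson([0.5, -0.4, 0.3, -0.6, 0.2]))
```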
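Let us correct the autocorrelation between consecutive errors with the Prais-Winsten method, considering the AR(1) model $y_t = \phi_1 y_{t-1} + \varepsilon_t$, $t = 2, \dots, n$. The stationarity condition is $|\phi_1| < 1$. For this correction, in the first step, the first derivative of the residual sum of squares $S = \sum_{t=2}^{n} (y_t - \phi_1 y_{t-1})^2$ is set equal to zero:

$$\frac{\partial S}{\partial \phi_1} = -2 \sum_{t=2}^{n} y_{t-1}\,(y_t - \phi_1 y_{t-1}) = 0$$

When this equation is solved, the estimated value of $\phi_1$ is found as

$$\hat{\phi}_1 = \frac{\sum_{t=2}^{n} y_t\, y_{t-1}}{\sum_{t=2}^{n} y_{t-1}^2}$$

Using the estimate $\hat{\phi}_1$ in the model, the residuals $\hat{\varepsilon}_t = y_t - \hat{\phi}_1 y_{t-1}$ can be calculated for $t = 2, \dots, n$. The correlation coefficient $\hat{\rho}$ between the $t$-th and $(t-1)$-th errors can then be calculated from these residuals. If this value is substituted into the third-step equation of the Prais-Winsten method, the transformed values $y_t^*$ are acquired for $t = 3, \dots, n$. In the second step, a new equation $y_t^* = \phi_1 y_{t-1}^* + \varepsilon_t$ is formed and the procedure returns to the first step. This cycle is repeated until $\hat{\rho} \to 0$.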
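The cycle described above can be sketched as follows. This is only an illustration under stated assumptions: the transformation is taken to be the usual quasi-differencing $y_t^* = y_t - \hat{\rho}\, y_{t-1}$ shared by the Cochrane-Orcutt and Prais-Winsten procedures, the special rescaling of the first observation used in Prais-Winsten is omitted, and the function names, tolerance and iteration limit are hypothetical.

```python
def fit_ar1(y):
    """Least squares estimate of phi1 in y_t = phi1 * y_{t-1} + e_t."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    if den == 0:
        raise ValueError("lagged series is identically zero")
    return num / den


def lag1_autocorrelation(e):
    """Sample lag-1 autocorrelation of residuals (0.0 if all residuals vanish)."""
    den = sum(v * v for v in e)
    if den == 0:
        return 0.0
    return sum(e[t] * e[t - 1] for t in range(1, len(e))) / den


def quasi_difference(y, rho):
    """Assumed transformation step: y*_t = y_t - rho * y_{t-1}."""
    return [y[t] - rho * y[t - 1] for t in range(1, len(y))]


def iterate_correction(y, tol=1e-3, max_iter=20):
    """Repeat fit -> residual autocorrelation -> transform until rho is near zero.

    Returns the list of (phi1, rho) pairs, one per pass.
    """
    history = []
    series = list(y)
    for _ in range(max_iter):
        phi1 = fit_ar1(series)
        resid = [series[t] - phi1 * series[t - 1] for t in range(1, len(series))]
        rho = lag1_autocorrelation(resid)
        history.append((phi1, rho))
        # Stop when autocorrelation is negligible or too few points remain.
        if abs(rho) < tol or len(series) <= 3:
            break
        series = quasi_difference(series, rho)
    return history


if __name__ == "__main__":
    data = [1.0, 1.5, 1.2, 1.8, 1.4, 2.0, 1.7, 2.2]
    for phi1, rho in iterate_correction(data):
        print(round(phi1, 4), round(rho, 4))
```

If the loop ends because `max_iter` is reached while `rho` is still far from zero, this corresponds to the non-convergence case that motivates the hybrid model.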
The final equation, which has no autocorrelation, satisfies the regression assumptions stated above and is ready for regression methods to be applied. An equation without autocorrelation can also be found for the AR(2), …, AR($p$) models. After the time series equation has been made ready for linear regression with the Prais-Winsten method, the least squares estimate of $\phi$ can be obtained as

$$\hat{\phi} = \arg\min_{\phi} \| Y - X\phi \|^2$$

The solution of this minimization problem is acquired from the normal equations of linear regression, defined as

$$X^T X \hat{\phi} = X^T Y$$

If $X^T X$ is singular, the system has no unique solution and the estimate above cannot be obtained. In this case, using Tikhonov regularization (the ridge estimator), the problem can be solved with the singular value decomposition [26].
However, at the end of the Prais-Winsten procedure, which corrects autocorrelation gradually, the autocorrelation may not have been removed, or the degree of the model may have to be changed. For these undesirable situations, a new hybrid model is proposed, based on the basic concepts of SMRGT for fuzzifying variables and on fuzzy regression approaches to time series forecasting.
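A small sketch of the normal equations with an optional ridge term is given below. It solves $(X^T X + \lambda I)\,\hat{\phi} = X^T Y$ directly by Gaussian elimination rather than through the singular value decomposition mentioned in the text; the names `ridge_estimate`, `solve_linear` and `lam` are illustrative assumptions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_estimate(X, Y, lam=0.0):
    """Least squares (lam = 0) or ridge (lam > 0) estimate of the parameters."""
    p = len(X[0])
    A = [
        [sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0) for j in range(p)]
        for i in range(p)
    ]
    b = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(p)]
    return solve_linear(A, b)


if __name__ == "__main__":
    print(ridge_estimate([[1, 0], [0, 1], [1, 1]], [1, 2, 3]))
    # Perfectly collinear columns: only solvable with lam > 0.
    print(ridge_estimate([[1, 1], [2, 2]], [1, 2], lam=1.0))
```

With `lam = 0` and collinear columns, `solve_linear` raises an error, which mirrors the singular case discussed above; a positive `lam` makes the system solvable at the cost of some bias.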

Simple membership function and fuzzy rule generation technique (SMRGT)
As mentioned previously, the main question in any given fuzzy system is how to construct the membership functions (MFs) and fuzzy rules (FRs) such that the system yields the best results. In this study, therefore, a simple technique is proposed to help those who have difficulty deciding on the number, the shape, and the logic of the MFs and FRs in any fuzzy system. Toprak (2009) introduced a new fuzzy technique for open canal flow modeling that depends only on some key numbers for all MFs of the input and output variables. The key numbers were selected according to the MF shape (triangular, trapezoid, etc.) and the defuzzification method (centroid, maximum membership degree, etc.).
The proposed method uses the following algorithm, whose steps should be applied for successful results [8]:
i. Decide on the independent variables which affect the dependent variable for the event at hand. The independent variables are the inputs, whereas the dependent variable(s) is the output of the fuzzy system.
ii. Determine the maximum and minimum values (variation domain) for each variable.
iii. Decide on the MF shape (i.e., triangular, trapezoidal, etc.).
iv. Decide on the number of MFs for each independent variable (a minimum of three MFs is required).
v. Determine the width and the core of the MFs with their key values for each independent variable. Note that the number of key values will be equal to the number of MFs for each independent variable (Figure 1).

Although this technique is successful in determining flow discharge in open channels and can be used confidently for flow discharge modeling in open canals, as well as in pipe flow with some modifications, there is no research paper which demonstrates that this technique is successful in fuzzy linear regression modeling. Therefore, in order to address the lack of such a model, a new model is described in this study.

Fuzzy least squares
A clear way to bring the fuzzy regression model in line with statistical regression is to model fuzzy regression along the same lines [27]. Let us begin with the standard linear regression model described in equation (5).
In contrast, the fuzzy regression model may take the following form:

$$\tilde{Y} = \tilde{X}\phi + \tilde{\varepsilon}$$

Conceptually, the $i$-th fuzzy response and explanatory variables are as shown in Figure 2.

Formulation of the Proposed Model
For a given fuzzy linear regression, how can the least squares approach be optimally designed? SMRGT can be used to fuzzify the variables for the well-known fuzzy least squares equation shown in equation (12). With this method, the limitation of the autocorrelation correction process in linear regression is lifted by exploiting the advantages of fuzzy regression models.
The main difference between the proposed method and SMRGT is that in SMRGT the output is a multiple of the inputs, whereas in our method the parameter $\phi$ is estimated using the input and the output. For this reason, the number of MFs ($n_{MF}$) for the dependent variable is chosen as the square of $n_{MF}$ for the independent variable.
Let us fuzzify the independent variable $X$ and the dependent variable $Y = (y_2, \dots, y_n)^T$. The problem can be modeled in light of the SMRGT algorithm given in section 3.1.
The maximum and minimum values of the independent and dependent variables were decided.
For this model, the MF shapes adopted for each variable were triangles, which are the simplest form of MF.
Because of the difficulty of running the program from hard or removable drives, $n_{MF}$ for each independent variable is limited to five. With this decision, $n_{MF}$ for the dependent variable is limited to twenty-five.
The width (variation range) $X_R$ of each variable can be obtained by $X_R = X_{\max} - X_{\min}$. The key values of the intermediate MFs, $K_i$, are then determined. For the intermediate MFs, one can accept the unit width of the MF as the width of a right-angled triangle. Therefore, the unit width, UW, is given as $UW = X_R / n_u$, where $n_u$ is the number of units.
On the other hand, neighboring MFs should overlap with each other. Therefore, for each unit of MF it is necessary to have an extended unit width, EUW, defined symmetrically for the center MF, the previous MF and the next MF. The key values of the first, $K_1$, and the last, $K_{n_{MF}}$, MFs correspond to the centroids of these MFs. Using these rules, the key values of the fuzzy dependent variable $\tilde{Y}$ and the fuzzy independent variable $\tilde{X}$ are obtained. These key values are used to create the table of fuzzy rules (Table 1). The data in this table are the inputs and outputs of the fuzzy least squares method for the fuzzy linear regression model. In the current modeling, we propose to use triangular MFs and the centroid method for defuzzification. The procedure of the proposed model is described in Figure 3.
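The key value computation can be sketched in code. The sketch rests on assumptions that are not stated explicitly in the text: the number of units is taken as $n_u = 2(n_{MF} - 1)$ (which matches the reported pairs 5 MFs with 8 units and 25 MFs with 48 units), the intermediate key values are placed at even multiples of UW from the minimum, and the first and last key values are placed UW/2 inside the domain. The variation domains [16, 18.5] and [7, 10.5] used in the example are likewise inferred, chosen because they reproduce the key values reported in the paper for the CPCR and CPVR data. The function name is hypothetical.

```python
def smrgt_key_values(x_min, x_max, n_mf):
    """Key values for n_mf triangular MFs over [x_min, x_max].

    Assumed layout: the two end MFs are right-angled triangles of one unit
    each and every intermediate MF spans two units.
    """
    if n_mf < 3:
        raise ValueError("at least three MFs are required")
    if x_max <= x_min:
        raise ValueError("x_max must exceed x_min")
    n_u = 2 * (n_mf - 1)           # number of units (assumption)
    uw = (x_max - x_min) / n_u     # unit width: variation range / units
    first = x_min + uw / 2.0       # assumed centroid position of the first MF
    last = x_max - uw / 2.0        # assumed centroid position of the last MF
    middle = [x_min + 2 * i * uw for i in range(1, n_mf - 1)]
    return [first] + middle + [last]


if __name__ == "__main__":
    print(smrgt_key_values(16.0, 18.5, 5))   # inferred CPCR domain
    print(smrgt_key_values(7.0, 10.5, 5))    # inferred CPVR domain
```

The same function with `n_mf=25` gives 25 key values for the dependent variable, once its minimum and maximum are known.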

Validation approach and comparison measures
To evaluate the performance of regression for time series and of SMRGT, several measures can be used. The performance measures used in our applications are Adjusted $R^2$ (Adj-$R^2$), $R^2$, Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Square Error (MSE), Root Mean Square Error (RMSE), Akaike's Information Criterion (AIC) and the Correlation Coefficient (rho).
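These measures can be computed from observed and predicted values as sketched below. The exact definitions used in the paper are not given, so the formulas here are common textbook forms and should be read as assumptions: MSE divides by $n$, Adj-$R^2$ uses $n - k - 1$ degrees of freedom, AIC is taken as $n \ln(\mathrm{SSE}/n) + 2k$, and rho is the Pearson correlation between observed and predicted values. The function name and the argument `k` (number of estimated parameters) are illustrative.

```python
import math


def performance_measures(y, y_hat, k):
    """Return a dict of fit measures for observations y and predictions y_hat."""
    n = len(y)
    err = [a - b for a, b in zip(y, y_hat)]
    sse = sum(e * e for e in err)
    y_bar = sum(y) / n
    f_bar = sum(y_hat) / n
    sst = sum((a - y_bar) ** 2 for a in y)
    var_f = sum((b - f_bar) ** 2 for b in y_hat)
    cov = sum((a - y_bar) * (b - f_bar) for a, b in zip(y, y_hat))
    mse = sse / n
    r2 = 1.0 - sse / sst
    return {
        "MAE": sum(abs(e) for e in err) / n,
        # MAPE is undefined when an observed value is zero.
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(err, y)) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": r2,
        "AdjR2": 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1),
        # AIC is undefined for a perfect fit (sse == 0).
        "AIC": n * math.log(sse / n) + 2 * k,
        "rho": cov / math.sqrt(sst * var_f),
    }


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.0, 2.5, 4.0]
    for name, value in performance_measures(observed, predicted, k=1).items():
        print(name, round(value, 4))
```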

Construction of linear regression model
When the regression model is constructed, the time series model must first be made to satisfy the regression assumptions. In time series there is autocorrelation between consecutive errors, and this can be corrected with autocorrelation correction methods. In this study, autocorrelation is measured with the Durbin-Watson test [22,23] and corrected with the Prais-Winsten procedure using STATA [28]. In both data sets convergence is achieved for $p = 2$, so the AR(2) time series model is used.

Construction of SMRGT model

Discussion and Conclusion
In time series regression applications, the methods used to correct autocorrelation may fail to converge; in that case the analysis either does not succeed or the degree of the model has to be changed. Changing the degree of the model may not be desirable in every situation. In these situations a new regulation methodology is needed. In this study, it is proposed to use SMRGT to fuzzify the variables for use in fuzzy OLS models, and the steps to be followed are shown. In this method, the limitation of the autocorrelation correction process in linear regression is lifted by exploiting the advantages of fuzzy regression models.
Although SMRGT is successful in determining flow discharge in open channels and can be used confidently for flow discharge modeling in open canals, as well as in pipe flow with some modifications, there is no research paper which demonstrates that this technique is successful in fuzzy linear regression modeling. Therefore, in order to address the lack of such a model, the new hybrid model has been described with some changes to SMRGT, in order to show the success of this technique in fuzzy linear regression modeling.
This study also includes the results of several performance measurement criteria, independent of the methods, to measure how effective the proposed technique is. When the 8 performance criteria used for the two data sets are considered, the SMRGT model performed better than the linear regression model on three of them (Table 4 and Table 5).

Figure 1. Parts of fuzzy set

vi. These key values are the inputs of the fuzzy model.
vii. The fuzzy model is valid for the data distributed between the key values of the first and the last MFs for each variable.
viii. Tables of fuzzy rules are prepared (Table 1).

Figure 2. Relationships between variables in fuzzy linear regression

If the equation is rearranged, $\tilde{\varepsilon} = \tilde{Y} - \tilde{X}\phi$ is obtained. From a least squares perspective, the problem then becomes the minimization of the squared error $\| \tilde{Y} - \tilde{X}\phi \|^2$ over $\phi$.

Figure 3. Procedure in the proposed model

In this case, the five MFs, of which the first and the last are right-angled triangles, consist of eight units. For this reason, $n_{MF} = 5$ and $n_u = 8$ for the independent variables, and $n_{MF} = 25$ and $n_u = 48$ for the dependent variable. According to our method, the new key values for $y_{t-1}$ and $y_{t-2}$ in the CPCR data are 16.15625, 16.625, 17.25, 17.875 and 18.34375. Likewise, the new key values for $y_{t-1}$ and $y_{t-2}$ in the CPVR data are 7.21875, 7.875, 8.75, 9.625 and 10.28125. The key values for $y_t$ can be obtained in a similar way. We can now construct the tables (Tables 2 and 3) that contain the fuzzy rules for the key values of both data sets.

Figure 4. Construction of triangular membership functions and key values to apply the centroid defuzzification method for the CPCR* and CPVR** data

An AR model is named according to the number of past-period observations that it contains. Generally, an AR model of order $p$ is denoted by AR($p$) and defined as

$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t \quad (1)$$

where $y_{t-1}, y_{t-2}, \dots, y_{t-p}$ are the independent variables and $\phi_1, \phi_2, \dots, \phi_p$ are the model parameters. It is assumed that the white noise process $\varepsilon_t$ has a normal distribution with zero mean and variance $\sigma_\varepsilon^2$. The model contains $p + 2$ unknown parameters, including $\sigma_\varepsilon^2$, which have to be estimated.

The AR(1) model written with the backward shift operator is $\varepsilon_t = (1 - \phi_1 B)\, y_t$. The condition $|\phi_1| < 1$ should be satisfied for the model to be stationary. The AR(2) model written with the backward shift operator is $\varepsilon_t = (1 - \phi_1 B - \phi_2 B^2)\, y_t$. The conditions $\phi_1 + \phi_2 < 1$, $\phi_2 - \phi_1 < 1$ and $|\phi_2| < 1$ should be satisfied for the model to be stationary.

Selection of an appropriate model for a time series is performed with BJ models. The steps are described below:
i. The model group is determined by examining the observations forming the series. At this stage, it is decided which model of the group would be appropriate. The first task in determining the appropriate model is to determine stationarity. The tools for examining stationarity are the autocorrelation functions and their correlograms.
ii. From the agreed group, it is decided which model type would be appropriate for the relevant series. When analyzing stationary time series and at the forecasting stages, one of three types of models (AR, MA, ARMA) is used. Model selection is performed using the autocorrelation and partial autocorrelation functions.

Table 1. Fuzzy rules for $n_{MF} = 5$, $p = 2$

S. Yalaz, A. Atay / Fuzzy Linear Regression for the Time Series Data which is Fuzzified with SMRGT Method

Table 4. Performance measures for CPCR data

Table 5. Performance measures for CPVR data

The MAE, MAPE, MSE, RMSE and AIC values calculated for SMRGT are close to the values obtained for linear regression. According to this study, when working with fuzzy numbers fuzzified with SMRGT, our method can be regarded as an alternative to the linear regression model for time series data.