
An EPC Forecasting Method for Stock Index Based on Integrating Empirical Mode Decomposition, SVM and Cuckoo Search Algorithm

  • Xiangfei Li, Zaisheng Zhang and Chao Huang
Published/Copyright: December 25, 2014

Abstract

To improve forecasting accuracy, this study introduces a hybrid error-correction approach that integrates the support vector machine (SVM), empirical mode decomposition (EMD) and an improved cuckoo search algorithm (ICS). Using two stock indexes as examples, the empirical study shows that the proposed approach, which synchronously predicts the forecasting error and uses it to correct the preliminary predicted values, achieves better prediction precision than five competing approaches. Furthermore, the improved cuckoo search algorithm outperforms three other evolutionary algorithms in parameter selection.

1 Introduction

Stock investment, long one of the main profit-making activities, is attracting more and more attention. Because the stock market carries high risks, investors need to anticipate the volatility of stock indexes in order to make sensible investment decisions. Effective forecasting models, which can greatly reduce personal decision mistakes by providing more accurate predictions, are therefore a main concern for investors and researchers. Because financial time series are inherently nonlinear and non-stationary, hybrid models, rather than single methods, are becoming widely used to overcome the limitations of financial time series forecasting. For example, Pai and Lin[1] integrated ARIMA with the support vector machine and obtained a better model for stock price forecasting. Wang[2] proposed a hybrid method with higher predictability by integrating GJR with GARCH. Chi et al.[3] combined grey theory with neural networks and provided a prediction model that overcomes the convergence problem caused by large amounts of input data. Wang et al.[4] gave a hybrid forecasting model combining the exponential smoothing model (ESM), ARIMA and BPNN. Lu et al.[5] provided an integrated model of NLICA, SVR and PSO with effective predictive results compared with four comparison models.

Among forecasting methods, the support vector machine (SVM) is a promising method that works more effectively than traditional linear models in time series forecasting, because it uses a risk function consisting of the empirical error and a regularized term derived from the structural risk minimization principle[6]. Because of its outstanding performance, the SVM has been successfully applied to forecasting financial time series[7, 8].

However, a single SVM can hardly provide satisfying results either; therefore, as with the integration strategies above, many mixed methods incorporate the SVM model. For example, Chiu and Chen[7] proposed a dynamic fuzzy model based on the SVM method and chose the genetic algorithm (GA) to adjust the influence of each input variable; the experimental results show that the model generates a better accuracy rate than traditional forecasting methods. Because the non-stationary nature of stock price series is easily ignored, Hsu, Hsieh, Chih, et al.[9] provided a solution by integrating the self-organizing map and SVM. Lee[10] developed a prediction model based on SVM with a hybrid feature selection method to predict the trend of stock markets; his study shows that SVM outperforms BPN on the stock trend prediction problem. Kao, Chiu, et al.[8] integrated nonlinear independent component analysis and SVM for forecasting stock prices.

Generally speaking, no matter how perfect the forecasting method, error is inevitable, and the SVM is no exception. However, if we can effectively predict the error when forecasting a time series with the SVM or other methods, we may obtain better results by correcting for that error. The error-prediction idea is widely used in engineering science, for example in river and flood forecasting[11], electrical equipment load forecasting[12], and weather forecasting[13, 14]. Madsen and Skotner[11] proposed a new data assimilation procedure based on a general filtering update combined with error forecasting; the error forecast model was used to propagate model error at measurement points in the forecast period, and the results showed that the new model significantly improved flood forecasting ability. For energy saving, Yao et al.[12] proposed a forecasting model called RBFNN with combined residual error correction to provide accurate air-conditioning load forecasting; the case study indicates that the RBFNN with combined residual error correction has much better forecasting accuracy. The application of error prediction can also be found in the economic field. For example, Zhou et al.[15] proposed a novel ARIMA approach to forecasting electricity prices that, for the first time, improves on the predicted forecasting error; the results show that the presented approach improves accuracy. Chen and Leung[16] proposed an adaptive forecasting approach that combines the strengths of neural networks and multivariate econometric models to predict exchange rates. Anderson[17] discussed in detail the specification of a vector error correction forecasting model (VECM) anchored by long-run equilibrium relationships suggested by economic theory; the model was proved more accurate than the traditional forecasting model.

The key to the error-correction forecasting method is predicting the error value effectively, because the corrected forecast may deviate even more if the predicted error is inaccurate. However, owing to its high-frequency, non-stationary and chaotic properties, the error series is hard to forecast satisfactorily. To solve this problem, an extraction technique, generally used to extract the features contained in signals, is necessary, because a forecasting model based on these features can perform better[18–20].

Empirical mode decomposition (EMD), which is mainly used to extract the information contained in signals, is a signal processing technique primarily applied to image or signal processing; with its powerful feature extraction capability, however, it has now been successfully applied to time-series studies[21–25]. For example, Zhu, Sun and Li[21] used the EMD technique to decompose a load time series into a series of smooth intrinsic mode functions (IMFs) with different scales, forecast each IMF with the SVM, and obtained the final results by summing the forecasts of the IMFs. Yu et al.[22] proposed an EMD-based neural network ensemble learning paradigm for world crude oil spot price forecasting; owing to the decomposition work of EMD, each IMF was accurately predicted. The EMD is suitable for finding the fluctuation tendency of a time series, which simplifies the task into simple forecasting subtasks[26]. For its ability to reveal the hidden patterns and trends of time series, we use the EMD technique to process the error series produced by the single SVM, easing the forecasting work of the next step.

Another problem in SVM forecasting is parameter selection. The common practice is to optimize the penalty parameter and kernel parameter with intelligent optimization algorithms, most commonly the genetic algorithm (GA)[27–29] and particle swarm optimization (PSO)[30, 31]. With the continuous development of intelligent heuristic algorithms, a new optimization algorithm, cuckoo search (CS), was proposed in 2009 by Yang and Deb[32]. The algorithm is inspired by the reproduction strategy of cuckoos. The main component of CS is the use of Lévy flights as the search pattern. Because a Lévy flight is a random walk characterised by a series of instantaneous jumps drawn from a probability density function with a power-law tail, this kind of search allows CS to find all optima in a design space, and it has therefore been widely used in engineering science[33–35]. However, because CS is a new algorithm, it has defects, such as insufficient search energy and low accuracy, that have not been completely overcome. On the basis of the CS algorithm, this paper puts forward several improved strategies that form a new algorithm, called the improved cuckoo search (ICS) algorithm, which is used to optimize the selection of the SVM parameters. The following study shows that the ICS indeed performs better than grid search, GA, PSO and the single CS.

The main highlights of this paper are:

  1. An error prediction and correction method (EPC for short) for stock forecasting, integrating the EMD, SVM and ICS, is proposed. By forecasting the possible error simultaneously and using the predicted error to correct the preliminary results, we obtain forecasts with higher accuracy.

  2. For the error sequence forecasting, to address its high-frequency, non-stationary and chaotic properties, we introduce a feature extraction step, empirical mode decomposition (EMD). Using the EMD technique, we decompose the error series into a series of smooth intrinsic mode functions (IMFs) with different scales, forecast each IMF with the SVM, and obtain the final result by summing them.

  3. For the cuckoo search algorithm's defects of weak search ability and low accuracy, we propose several improved strategies. The results show that our improved cuckoo search algorithm (ICS) achieves better accuracy in fewer evolution steps than four benchmark methods for selecting the SVM parameters.

The chapters are arranged as follows. Section 2 gives a brief introduction to the SVM and EMD methods and then explains in detail the concepts and processes of the improved strategies for the CS algorithm. Section 3 presents the research scheme. Section 4 contains the empirical results and robustness evaluation on two samples, the SSEC and NASDAQ indexes. Section 5 gives the conclusions.

2 Research methodology

2.1 Support vector machine

Because the support vector machine plays a dominant role in the forecasting model, this section briefly introduces its principle and process. The SVM was proposed by Vapnik[36] in 1986. It is built on statistical learning theory and has attracted more and more attention because of its outstanding ability to solve nonlinear regression estimation problems. The SVM is based on the structural risk minimization principle; its basic idea is to map the data Xi into a high-dimensional feature space F and then fit a linear regression equation there. The equation can be expressed as:

f(X) = (w, φ(X)) + b    (1)

where w is the weight vector, b is the bias, and φ(X) is the nonlinear mapping from the space R^m to the space F. The traditional prediction or classification method is to find f ∈ F that minimizes the structural risk value. The structural risk is expressed as:

R_reg = λ∥w∥² + R_emp[f] = Σ_{i=1}^{S} C(e_i) + λ∥w∥²    (2)

where ∥w∥² is the confidence risk, which reflects the complexity of the model; R_emp[f] is the empirical risk; λ is a constant used to balance the complexity and the loss error of the model; C(e_i) is the experience loss of the model; and S is the sample size. With an established loss function, this problem can be transformed into the optimal solution of a quadratic programming problem. According to Vapnik[36], the ε-insensitive loss function can be defined as:

∥y − f(x)∥_ε = { |y − f(x)| − ε,  if |y − f(x)| ≥ ε;  0,  otherwise }    (3)

where ε controls the regression error range: the smaller its value, the higher the accuracy, but the lower the generalization ability. Based on this loss function, the empirical risk can be defined as:

R_emp^ε[f] = (1/S) Σ_{i=1}^{S} ∥y − f(x)∥_ε    (4)

Combining Eq.(1)∼Eq.(4), the original problem can be transformed into the following minimization of the linear risk:

min η = (1/2)wᵀw + C Σ_{i=1}^{S} (ζ_i + ζ_i*)
s.t.  y_i − (w, φ(X_i)) − b ≤ ε + ζ_i
      (w, φ(X_i)) + b − y_i ≤ ε + ζ_i*
      ζ_i, ζ_i* ≥ 0    (5)

where C = 1/λ, ε is the accuracy estimate, and ζ_i, ζ_i* are slack variables. To facilitate the solution, we transform it into the following dual problem:

max μ = −(1/2) Σ_{i,j=1}^{S} (α_i − α_i*)(α_j − α_j*)(φ(X_i), φ(X_j)) + Σ_{i=1}^{S} α_i(Y_i − ε) − Σ_{i=1}^{S} α_i*(Y_i + ε)
s.t.  Σ_{i=1}^{S} α_i = Σ_{i=1}^{S} α_i*
      0 ≤ α_i ≤ C,  0 ≤ α_i* ≤ C    (6)

Solving Eq.(6) yields w and b, which, substituted into Eq.(1), give the nonlinear function f(X):

f(X) = Σ_{i=1}^{S} (α_i − α_i*)(φ(X_i), φ(X)) + b    (7)

The kernel function, which defines the computation in the high-dimensional space, can be expressed as K(xi,xj) = φ(xi)·φ(xj). In this study we use the radial basis function Krbf(xi,xj) = exp(−γ∥xi−xj∥²). On this basis, the prediction problem can be transformed into solving a quadratic programming decision function. This study uses the EMD and ICS algorithms to optimize the parameters of the SVM prediction model.
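As a concrete illustration, the ε-SVR of Eq.(5)–(7) with the RBF kernel can be fitted in a few lines. scikit-learn and the toy sine data are illustration-only assumptions; the paper's experiments run in MATLAB. Here C plays the role of the penalty parameter, gamma the kernel parameter g, and epsilon the insensitive tube of Eq.(3).

```python
# Minimal epsilon-SVR sketch with an RBF kernel (scikit-learn assumed).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))          # toy inputs
y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)  # noisy target

# C = 1/lambda penalizes tube violations; gamma is the RBF width g;
# epsilon is the insensitive-loss tube of Eq.(3).
model = SVR(kernel="rbf", C=10.0, gamma=0.5, epsilon=0.05)
model.fit(X, y)

# The fitted decision function is Eq.(7), with K_rbf as the inner product.
pred = model.predict(X)
```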

2.2 Empirical mode decomposition

Empirical mode decomposition (EMD) is a promising nonlinear, non-stationary data processing method proposed by Huang et al.[37, 38]. It considers a real time series as fast oscillations superimposed on slow oscillations. The oscillations in the data are extracted based on the principle of local scale separation and are approximated by "intrinsic mode functions" (IMFs). An IMF must satisfy two conditions: 1) the numbers of extrema and zero-crossings are equal, or differ at most by one; 2) it is symmetric with respect to the local zero mean.

A sifting process extracts the IMFs level by level. First, the IMF with the highest frequency riding on the lower-frequency part of the data is extracted; then the IMF with the next-highest frequency is extracted from the difference between the data and the extracted IMF. The iterations continue until no IMF remains in the residual. The overall sifting procedure for a time series S(t) is as follows:

If the number of extrema of the original series exceeds the number of its zero-crossings by two or more, the decomposition begins.

  1. Identify all the maxima and minima of the original series, then estimate the upper and lower envelope functions by cubic spline interpolation.

  2. Calculate the average of the maximum and minimum envelopes, denoted m1(t). Define the difference between S(t) and m1(t) as

    S(t) − m1(t) = h1(t)    (8)

    where h1(t) is the series obtained by removing the low-frequency mean envelope from S(t).

  3. If h1(t) is still not smooth, repeat the above process; theoretically the iteration continues until the mean envelope is zero. However, if the mean envelope is forced to zero, some physical meaning of the amplitude or frequency modulation might be eliminated, so the criterion must be relaxed. Let

    SD = Σ_{t=0}^{T} |m_{k−1}(t) − m_k(t)|² / m_{k−1}²(t)    (9)

    where m_k(t) is the mean envelope function of the kth loop. In this study we let SD range between 0.1 and 0.2; this criterion relaxes the requirement on the mean envelope appropriately and helps retain the physical meaning of the IMF to a certain extent.

  4. Following the above process, we obtain the first component C1(t), defined as

    h_{1(k−1)}(t) − m_{1k}(t) = C1(t)    (10)

    C1(t), the first component, has the highest frequency; subtracting it from the original series S(t) gives a slightly smoother series r1(t), on which the above operation is repeated to obtain the second component C2(t) and r2(t). This is repeated until r_n(t) cannot be decomposed further, which ends the EMD decomposition.

    r_{n−1}(t) − C_n(t) = r_n(t)    (11)

    where r_n(t) represents the overall trend of the original sequence S(t); the original sequence is thus decomposed into several components plus an overall trend:

    S(t) = Σ_{j=1}^{n} C_j(t) + r_n(t)    (12)

    Every IMF component has a different vibration frequency and amplitude, representing information about the original sequence at a different scale.

    In practice, a mode mixing problem arises if the data has intermittency. Mode mixing is defined as a single IMF consisting of signals of widely disparate scales, or a signal of a similar scale residing in different IMF components. To overcome this problem, Wu and Huang[39] proposed the ensemble EMD (EEMD) method. The main procedure of EEMD is to add a white noise series to the target data series, decompose the noise-added data into IMFs, repeat these steps iteratively, and take the means of the corresponding IMFs of the decompositions as the final results.

    Therefore, this paper uses the EEMD (referred to collectively as EMD) method to decompose the training error and prediction error sequences into several components with different time-scale characteristics, which makes predicting the error sequence convenient. When processing the data, the end-point extremum problem of the original sequence is overcome with the polynomial fitting method[40] so that effective information is obtained. The main process of EMD-SVM is illustrated in Fig.1.

Figure 1 The process of EMD-SVM
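Steps 1–2 of the sifting procedure above can be sketched in a few lines: estimate the upper and lower envelopes with cubic splines through the extrema, take their mean m1(t), and subtract it from S(t) to get the candidate IMF h1(t) of Eq.(8). This NumPy/SciPy sketch is illustrative only; a full EMD/EEMD library would iterate this with the SD stopping criterion of Eq.(9) and proper end-point treatment.

```python
# One sifting pass of EMD: envelope mean via cubic splines, then subtract.
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(s):
    t = np.arange(len(s))
    imax = argrelextrema(s, np.greater)[0]   # local maxima
    imin = argrelextrema(s, np.less)[0]      # local minima
    # Pin the endpoints so the splines span the whole series (a crude
    # stand-in for the polynomial end-point fitting the paper cites).
    upper = CubicSpline(np.r_[0, imax, t[-1]], np.r_[s[0], s[imax], s[-1]])(t)
    lower = CubicSpline(np.r_[0, imin, t[-1]], np.r_[s[0], s[imin], s[-1]])(t)
    m1 = (upper + lower) / 2.0               # mean envelope m1(t)
    return s - m1                            # h1(t) = S(t) - m1(t), Eq.(8)

t = np.linspace(0.0, 1.0, 500)
S = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
h1 = sift_once(S)   # dominated by the faster 40 Hz oscillation
```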

2.3 Cuckoo search algorithm

2.3.1 Principle of cuckoo search algorithm

Cuckoo search is a random search method inspired by the unique brood parasitism behavior of cuckoos. According to zoologists, some cuckoo species are lazy by temperament: they never nest, hatch, or brood in the breeding season; instead, they reproduce by brood parasitism. That is, they look for host birds with physiology and diets similar to their own. These cuckoos quickly lay their eggs in the nest of a host species when the hosts go out for food, leaving those parents to hatch and nurture their young[41–43].

The cuckoo search algorithm simulates the random-walk search process by which cuckoos look for suitable host nests for laying eggs. The same way of traveling is common in other animals' foraging, such as the albatross[44], bees[45], fruit flies[46], spider monkeys[47], and baboons[48]. They all follow the Lévy flight distribution, which is the best search strategy when there are several independent searchers and the target is randomly distributed. The general process of the cuckoo search simulation is to initialize several bird nests, calculate the fitness value of each nest, and then let the birds update their habitat locations following the Lévy flight until the globally best solution point is found. The cuckoo Lévy flight search pattern is illustrated in Fig.2.

Figure 2 Lévy flight way of cuckoo

Here R_i is the cuckoo's search radius, the largest step a cuckoo can take in one update. When the target host nest is within the radius, the cuckoo flies directly to it in a straight line; otherwise it searches in the Lévy flight way[49]. Its random-walk step L_j is drawn from a Lévy distribution:

Lévy ∼ P(L_j) = L_j^{−μ},  1 < μ ≤ 3    (13)

That is to say, the move to a new host nest x_i^{t+1} of, say, cuckoo i is performed as

x_i^{t+1} = x_i^t + α ⊕ Lévy(μ)    (14)

Usually α = 1. The above equation is essentially the stochastic equation for a random walk. The product ⊕ means entry-wise multiplication, similar to that used in PSO, but more efficient in exploring the search space, since its step length is much longer in the long run.

Here the steps essentially form a random-walk process obeying a power-law step-length distribution with a heavy tail[50]. Some of the new solutions should be generated by a Lévy walk around the best solution obtained so far, which speeds up the local search. However, a substantial fraction of the new solutions should be generated by far-field randomization, with locations far enough from the current best solution, to make sure the system is not trapped in a local optimum.
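The heavy-tailed step lengths of Eq.(13) are commonly drawn with Mantegna's algorithm, sketched below in NumPy as an illustration of the update rule of Eq.(14); the exponent beta corresponds roughly to μ − 1, and the choice beta = 1.5 is an assumption for this sketch, not taken from the paper.

```python
# Levy-flight step lengths via Mantegna's algorithm (NumPy sketch).
import numpy as np
from math import gamma, sin, pi

def levy_steps(n, beta=1.5, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    # Mantegna's scale factor for the numerator Gaussian.
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, n)
    v = rng.normal(0.0, 1.0, n)
    return u / np.abs(v) ** (1 / beta)   # heavy-tailed steps

steps = levy_steps(10_000)
# Eq.(14): x_new = x_old + alpha (entry-wise) * step, with alpha = 1 here.
x_new = np.zeros(10_000) + 1.0 * steps
```

Most steps are short (local search around good solutions), while the occasional very long jump provides the far-field randomization described above.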

2.3.2 Improvement of cuckoo search algorithm

The cuckoo search algorithm proposed by Yang and Deb[32] is based on three idealized rules: first, each cuckoo lays only one egg at a time and dumps it in a randomly chosen nest; second, the best nests with high-quality eggs carry over to the next generation; third, the number of available host nests is fixed. Under these assumptions the algorithm is succinct and easy to realize, but, like other evolutionary algorithms, it suffers from defects such as weak search ability and low accuracy. This paper therefore proposes several improvements based on the cuckoo's natural habits, called the improved cuckoo search (ICS).

Similar to the chromosome in the genetic algorithm and the particle location in particle swarm optimization, we use the nest location as the data point in ICS. The basic information is illustrated in Fig.3.

  1. The figure shows the whole habitat, containing M nests and n birds in total. Each nest location is a multi-dimensional vector with a fitness value. For an m-dimensional problem, say, for cuckoo i, Nest_i = [N_1, N_2, ⋯, N_m], and its fitness value is f(Nest_i), i ∈ [1, M].

    Figure 3 The migration and brood parasitism of cuckoos

  2. The cuckoo lays eggs only in a certain range of space measured by the radius, shown in the figure as R_i. In fact the cuckoo does not lay just one egg each time in the breeding period; it lays eggs randomly in its spawning space, and their number is proportional to the radius, satisfying the following relation[51]:

    C_i(R_i) = φ × (C_i^{eggs} / Σ_{i=1}^{n} C_i^{eggs}) × (Var_max − Var_min)    (15)

    For cuckoo i, C_i(R_i) is its searching radius, C_i^{eggs} is the number of eggs it lays at one time, n is the total number of cuckoos, and Var_max − Var_min is the interval between the maximum and minimum steps, which determines the search range and accuracy.

  3. Taking the individuals C1 and C2 as examples: in each generation, the cuckoo searches for a goal nest following the Lévy distribution, and its searching step L_j is a random variable. On the one hand, the random searching step is not easily trapped in a local optimum; on the other, a sudden big step may take the fitness value far away from the optimal solution and lose accuracy. A self-adaptive searching step is therefore necessary. According to the relationship between the searching radius and the egg number reflected in Eq.(15), we assume that the searching step changes with the number of eggs laid. In more detail, suppose the cuckoo's searching step takes a random value between 0 and its searching radius, and let the step length be L = αC_i(R_i), α ∈ rand[0, 1]. At the beginning of the search, the cuckoo lays more eggs to adapt to the hostile environment, while as it gets closer to the optimal solution it lays fewer eggs to reduce the burden. The searching step is therefore self-adaptive, becoming shorter as the optimal solution is approached.

    C_i^{eggs} = round[C_i^{eggs}(O) · K · (f_all(N_best) − f_i(N_j)) / f_all(N_best)]    (16)

    In Eq.(16), the operator round ensures the egg number is an integer. For cuckoo i, C_i^{eggs}(O) is the original number of eggs, f_i(N_j) is the fitness value of the jth nest, f_all(N_best) is the best solution of all nests, and K is an adjustment coefficient controlling the egg numbers. By controlling the searching step, the accuracy improves accordingly. Note that no matter how many eggs a cuckoo lays, only one can survive in a nest, for once hatched, the chick pushes the other eggs out and enjoys the sole tending.

  4. The eggs laid by cuckoos are discovered by the host birds with a probability P (usually 10%). In this case, the host bird either throws the egg away or abandons the nest. Undetected eggs have a chance to hatch and become the next generation of birds, which search for better host nests in their spawning space; in this paper, a better goal nest means a better fitness value.

  5. When a cuckoo migrates to a new position with a fitness value lower than that of its last nest, the bird is regarded as deviating from its goal nest, which has a better fitness value; we therefore weed out such birds to avoid unnecessary computation. Moreover, to increase randomness, the random eggs are hatched into mature birds in the next generation and continue searching following the Lévy flight.

The improved CS therefore has the following properties. First, as mentioned in [50], the Lévy flight search is the best strategy only when there are several independent searchers; the ICS thus introduces several cuckoos searching independently in parallel, increasing searching ability and efficiency while meeting the requirements of the best search strategy. Second, the cuckoo eggs hatch and become the mature birds of the next generation; this biological evolution style increases the diversity of the cuckoo population and the randomness of the search. Third, by controlling the egg-spawning behavior, we obtain self-adaptive searching steps, which ensure a wider searching range at the beginning and higher accuracy at later stages.
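The self-adaptive spawning rules of Eq.(15)–(16) can be sketched as follows. This is one plausible reading of the two formulas, with assumptions made explicit: a maximisation fitness is assumed in Eq.(16) so that egg counts shrink as a nest approaches the population best, and the parameter values (φ, K, Var bounds) are illustrative, not the paper's.

```python
# Sketch of Eq.(15) (radius from egg share) and Eq.(16) (adaptive egg count).
import numpy as np

def search_radius(eggs, phi=1.0, var_max=0.5, var_min=0.0):
    # Eq.(15): radius proportional to each cuckoo's share of all eggs laid.
    eggs = np.asarray(eggs, dtype=float)
    return phi * (eggs / eggs.sum()) * (var_max - var_min)

def next_egg_counts(eggs0, fitness, K=1.0):
    # Eq.(16), read as: fewer eggs as a nest's fitness approaches the best
    # (maximisation assumed; the exact normalisation is our assumption).
    f_best = np.max(fitness)
    return np.round(np.asarray(eggs0) * K * (f_best - fitness) / f_best).astype(int)

r = search_radius([2, 3, 5])                                   # three cuckoos
counts = next_egg_counts(np.array([5, 5, 5]),
                         np.array([1.0, 5.0, 10.0]))           # nest fitnesses
```

The cuckoo nearest the best fitness gets the smallest egg count, and through Eq.(15) the smallest radius, which is exactly the shrinking self-adaptive step described above.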

The central factors for an SVM model are the penalty parameter C and the kernel parameter g. We therefore optimize them with the ICS algorithm. The optimization steps are illustrated in Fig.4 and detailed as follows:

Figure 4 The process of SVM optimized by ICS

  1. Data initialization. Generate n bird nests and the same number of birds, recording their locations, denoted Nest(0) = [N_1(0), N_2(0), ⋯, N_n(0)]. Set the largest population Pop_max, the maximum number of iterations iter_max, Var_max and Var_min, and let egg_i ∈ [2, 5]. The jth nest has M dimensions when the problem is an M-dimensional optimization problem.

  2. Let the cuckoos spawn randomly within their radii. Some of the eggs (usually 10%) that are not similar to the host's eggs are detected and killed by the hosts. In each generation, for each cuckoo, generate a probability P obeying a uniform distribution, P ∈ rand[0, 1]; P < 10% means the current nest has a lower profit value and the egg in it is killed by the host bird.

  3. Count the currently existing cuckoo birds n_birds and eggs n_eggs, and calculate the fitness value of their corresponding nest locations. If n_birds + n_eggs > Pop_max, kill the extra birds or eggs with lower fitness values. Record the best nest location Nest_best(0) with the best fitness value Fitness_best(0) = f(Nest_best(0)).

  4. If Fitness_best(0) of step 3 has not yet reached the precision requirement, hatch the current eggs and grow them into adult birds, then let all the adult birds update their habitats through Lévy-flight migration. All the birds then have a new set of habitat locations and corresponding fitness values. Record the current best nest location Nest_best(1) with best fitness value Fitness_best(1).

  5. Compare Fitness_best(0) with Fitness_best(1); choose the better one and use its nest location set to replace the worse one. Doing so lets all birds migrate to the nest set with the better fitness value. Meanwhile, weed out the birds whose fitness values are lower than in the previous generation, for those birds may have migrated to worse environments.

  6. Determine whether the best solution of step 5 satisfies the precision requirement; if so, output the global optimal solution, otherwise go back to step 2 and repeat the computation until satisfying results are obtained.

    Note that the weeding-out differs significantly among steps 2, 3 and 5. Step 2 weeds out only eggs, which not only reflects the true state of the cuckoo's breeding procedure in nature but also, to a certain extent, increases the random disturbance and searching ability of the ICS. Step 3 weeds out both adults and eggs, to control the total population. Step 5 weeds out only adults, for only adults migrate following the Lévy flight; moreover, since eggs are not taken into consideration until the next generation, not eliminating them retains the diversity of the next generation's search and increases the chance of finding the global optimal solution.
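Whatever search algorithm proposes the candidate (C, g) pairs, each "nest location" is scored by the same fitness: the cross-validated error of an SVR trained with those parameters. The sketch below illustrates that fitness function; scikit-learn, the synthetic data, and the three candidate nests are assumptions for illustration (the paper's implementation is in MATLAB).

```python
# Fitness used when ICS (or GA/PSO/CS/grid search) tunes the SVR pair (C, g).
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.uniform(-2.0, 2.0, size=(120, 4))                      # toy features
y = X @ np.array([0.5, -1.0, 0.2, 0.8]) + 0.05 * rng.normal(size=120)

def fitness(nest):
    """One 'nest location' is a (C, g) pair; higher fitness is better."""
    C, g = nest
    model = SVR(kernel="rbf", C=C, gamma=g, epsilon=0.01)
    # Negated MSE so that the search maximizes fitness.
    return cross_val_score(model, X, y, cv=3,
                           scoring="neg_mean_squared_error").mean()

# Any of the compared algorithms just proposes nests and keeps the best one:
nests = [(1.0, 0.1), (10.0, 0.5), (100.0, 2.0)]
best = max(nests, key=fitness)
```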

3 Research scheme

Owing to the high-frequency, non-stationary and chaotic properties of stock index data, a stock price forecasting model that uses the original stock index data fails to provide satisfying forecasts. To solve this problem, many studies first apply an information extraction technique to extract the features contained in the data before constructing a forecasting model, and then use these extracted characteristics to construct the model. The following illustrates our proposed EPC method.

As discussed above, forecasting inevitably produces a certain error, which leads to unsatisfactory results. We therefore assume that if the error can be predicted effectively, the model can achieve higher prediction precision by feeding the error prediction back and using it to modify the preliminary results. This section proposes our synchronized error-prediction idea based on the hybrid of SVM, EMD and ICS.

As shown in Fig.5, the procedure of the error-correction forecasting can be expressed as follows:

Figure 5 The procedure of the EPC method

  1. Data preprocessing. For a known stock price time series {x_t, t = 1, 2, ⋯, n}, the data-set space must be reconstructed to satisfy the prerequisites of SVM analysis; that is, transform the time series into matrix form and construct samples (X_t, Y_t), where X_t = {x_{t−m}, x_{t−m+1}, ⋯, x_{t−1}} and Y_t = x_t. Here m is the sliding-window size, meaning the first m days' trading prices are used to predict the (m+1)th day's price. To use the existing data effectively, segmentation is done first. According to the purposes of each stage, we divide the raw sequence into a training dataset A1, a preliminary test dataset A2 and a final test dataset A3, and reconstruct each of them according to the above space reconstruction principle.

  2. Preliminary forecast of the original sequence. With datasets A1 and A2 as the training samples, predict the values P(A3) of dataset A3 with the SVM method. P(A3) is the preliminary forecast, to be corrected with the predicted error in the following steps. For the preliminary SVM forecast, since stock indices are subject to many influencing factors, we cannot expect good results from a single-factor indicator; therefore, following domain experts and prior research, we choose four technical indicators as input variables. Table 1 shows these technical indicators and their formulas.

    Table 1

    Initial input features and their formulas

    Feature name                     Formula                                                                       Refs
    CCI (Commodity Channel Index)    (M_t − SM_t) / (0.015 × D_t)                                                  [52–54]
    RSI (Relative Strength Index)    100 − 100 / (1 + (Σ_{i=0}^{n−1} Up_{t−i}/n) / (Σ_{i=0}^{n−1} Dw_{t−i}/n))     [52, 53]
    CPP (Current Price Position)     1 / (1 + e^{−(C_t − MA_{t−1,t−i}) / MA_{t−1,t−i}}) × 100                      [55]
    ROC (Price Rate-of-Change)       C_t / C_{t−n} × 100                                                           [56]

    Note: C_t is the closing price at time t, L_t the low price, H_t the high price, M_t = (H_t + L_t + C_t)/3, SM_t = (Σ_{i=1}^{n} M_{t−i+1})/n, D_t = (Σ_{i=1}^{n} |M_{t−i+1} − SM_t|)/n, Up_t the upward price change, Dw_t the downward price change, and MA_t the moving average over t days.

  3. Predict the error sequence by EMD-SVM. Because the error sequence is high-frequency, non-stationary and chaotic, we decompose it into several IMFs with different time scales, choose different kernel functions and parameters for each IMF, make predictions for each IMF separately, and finally obtain the predicted error by summing the IMFs' predictions. Forecasting the error over the A3 dataset can be realized in two ways:

    Method a. Using the training errors of the A1 and A2 sets as the training samples, predict the error values in A3 and obtain the sequence Ep(A3)1;

    Method b. First use the A1 set as the training samples to obtain predictive values for the A2 set; then take Ep(A2), calculated as the real values minus the predicted values, as the training samples and predict the error values Ep(A3)2 in the A3 set.

    We call Method a, which forecasts the training error and corrects the initial prediction, "Training error prediction & correction (TEPC)", and Method b "Forecast error prediction & correction (FEPC)".

  4. Final data prediction. Use the predicted error values Ep(A3)1 and Ep(A3)2 of the A3 set to correct P(A3), obtaining the corrected results P*(A3)1 and P*(A3)2 respectively.

    One important issue is determining the parameters when setting up the SVM model. In our study the parameters, notably the penalty parameter C and the kernel parameter g, which decisively influence prediction accuracy and generalization ability, are optimized by the ICS algorithm. To verify the effectiveness of the selected parameters, we use several methods, including the single cuckoo search (CS), grid search (GS), the genetic algorithm (GA) and particle swarm optimization (PSO), as comparison groups for the proposed ICS algorithm. All algorithms are implemented in MATLAB (R2012). For the ICS we set the initial bird number n = 5, Popmax = 10, eggi ∈ [2, 5], Varmax = 0.5 and Varmin = 0.
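The data preparation in steps 1–2 above can be sketched as follows. This is a minimal illustration in plain Python, not the paper's MATLAB implementation: the function names, window sizes and toy prices are assumptions, and only three of the four Table 1 indicators (CCI, RSI, ROC) are shown.

```python
def reconstruct(series, m):
    """Step 1: build samples Xt = (x_{t-m}, ..., x_{t-1}) with target Yt = x_t."""
    X = [series[t - m:t] for t in range(m, len(series))]
    Y = series[m:]
    return X, Y

def split(series, n_train=280, n_pre=100):
    """Partition into training (A1), preliminary test (A2), final test (A3)."""
    return series[:n_train], series[n_train:n_train + n_pre], series[n_train + n_pre:]

def cci(highs, lows, closes, n):
    """Commodity Channel Index: (Mt - SMt) / (0.015 * Dt)."""
    M = [(h + l + c) / 3 for h, l, c in zip(highs, lows, closes)]
    SM = sum(M[-n:]) / n                       # mean of the last n typical prices
    D = sum(abs(m_ - SM) for m_ in M[-n:]) / n  # mean absolute deviation
    return (M[-1] - SM) / (0.015 * D)

def rsi(closes, n):
    """Relative Strength Index over the last n price changes."""
    deltas = [closes[i] - closes[i - 1] for i in range(len(closes) - n, len(closes))]
    up = sum(d for d in deltas if d > 0) / n
    dw = sum(-d for d in deltas if d < 0) / n
    return 100 - 100 / (1 + up / dw)

def roc(closes, n):
    """Price Rate-of-Change relative to n periods ago."""
    return closes[-1] / closes[-1 - n] * 100
```

In practice each row of X would be the window of indicator values feeding the SVM, and A1/A2/A3 would be reconstructed separately with the same window size.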

4 Empirical study

4.1 Datasets and performance criteria

To evaluate the performance of the proposed forecasting model, two stock market indexes, the SSE Composite Index of China (SSEC) and the National Association of Securities Dealers Automated Quotations (NASDAQ) index, are used herein. All data collected in this study are cash closing indexes. The time period of each closing index is summarized in Fig.6 and Fig.10. There are 437 data points for SSEC and 466 for NASDAQ. The first 280 data points serve as the training samples (A1), the following 100 as the preliminary testing samples (A2), and the remaining points as the final testing samples (A3). To distinguish the three datasets easily, we mark them in different colors: the green curve is the training dataset (280 points), the cyan curve the preliminary test dataset (100 points), and the blue curve the final test dataset (the remaining points). These three colors keep the same meaning in the following figures, while the red curve represents the predicted series.

Figure 6 The daily SSEC closing indexes from 04/01/2011 to 22/10/2012

Figure 7 The preliminary predictive result of SSEC through single SVM

Figure 8 The Ep(A3)1 and P*(A3)1 of SSEC through TEPC

Figure 9 The Ep(A3)2 and P*(A3)2 of SSEC through FEPC

Figure 10 The daily NASDAQ closing indexes from 04/01/2011 to 08/11/2012

To verify the performance of the proposed error-correction forecasting method, its forecasting results are compared with the BP neural network, the integrated wavelet network, the single SVM, the single ARIMA and the single ANFIS. Meanwhile, to assess the effectiveness of ICS, the grid search method (GS) and other evolutionary algorithms, namely the genetic algorithm (GA) and particle swarm optimization (PSO), are introduced as comparative methods in the following study.

The forecasting performance metrics are the root mean square error (RMSE), the mean absolute percentage error (MAPE) and the mean absolute error (MAE). Table 3 reports the numerical results of these three metrics, which measure the deviation between the real and predicted data of the SSEC and NASDAQ indexes in our empirical study.
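For reference, the three metrics can be sketched in a few lines of plain Python (MAPE is returned as a percentage, matching the tables); the function and variable names are illustrative:

```python
import math

def rmse(real, pred):
    """Root mean square error."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(real, pred)) / len(real))

def mape(real, pred):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((r - p) / r) for r, p in zip(real, pred)) / len(real)

def mae(real, pred):
    """Mean absolute error."""
    return sum(abs(r - p) for r, p in zip(real, pred)) / len(real)
```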

Table 2

The parameter selecting results for each IMF through 5 different methods (SSEC)

Error sequence | Component | GS | GA | PSO | CS | ICS | Best C | Best g
(The GS–ICS columns give MSE (fitness value) / evolution steps.)
TE(A1,A2) | Imf1 | 0.00224/- | 0.0056/521 | 0.04187/354 | 0.00876/652 | 0.00163/267 | 2^−13 | 2^−5
TE(A1,A2) | Imf2 | 0.00614/- | 0.0121/502 | 0.05102/407 | 0.00456/552 | 0.00187/255 | 2^−12 | 2^−4
TE(A1,A2) | Imf3 | 0.00216/- | 0.0133/531 | 0.06351/442 | 0.00823/660 | 0.00192/243 | 2^−12 | 2^−7
TE(A1,A2) | Imf4 | 0.00457/- | 0.0154/487 | 0.03123/356 | 0.01157/621 | 0.00201/299 | 2^−11 | 2^−7
TE(A1,A2) | Imf5 | 0.00556/- | 0.0182/466 | 0.02230/457 | 0.01354/557 | 0.00262/247 | 2^−11 | 2^−5
TE(A1,A2) | Imf6 | 0.00354/- | 0.0155/454 | 0.01190/401 | 0.00567/563 | 0.00207/156 | 2^−9 | 2^−4
TE(A1,A2) | Imf7 | 0.00196/- | 0.0144/375 | 0.03231/388 | 0.09657/489 | 0.00135/132 | 2^−10 | 2^−4
TE(A1,A2) | r | 0.00247/- | 0.0137/331 | 0.01232/424 | 0.02015/587 | 0.00122/94 | 2^−9 | 2^−6
Ep(A2) | Imf1 | 0.00190/- | 0.0166/498 | 0.03319/399 | 0.00878/552 | 0.00148/239 | 2^−7 | 2^−8
Ep(A2) | Imf2 | 0.00233/- | 0.0202/513 | 0.02037/446 | 0.00954/521 | 0.00189/242 | 2^−7 | 2^−4
Ep(A2) | Imf3 | 0.00201/- | 0.0130/487 | 0.04155/504 | 0.02247/483 | 0.00163/201 | 2^−8 | 2^−6
Ep(A2) | Imf4 | 0.00183/- | 0.0187/466 | 0.01162/397 | 0.01822/590 | 0.00122/114 | 2^−10 | 2^−4
Ep(A2) | r | 0.00126/- | 0.0140/371 | 0.03101/362 | 0.00650/532 | 0.00105/88 | 2^−9 | 2^−4

Table 3

Summary of forecast results of SSEC and NASDAQ

Index | Model | RMSE | MAPE | MAE
SSEC | BP neural network | 81.2453 | 0.569% | 12.3454
SSEC | Wavelet neural network | 53.2455 | 0.343% | 7.4363
SSEC | Single SVM | 73.7306 | 0.450% | 9.7659
SSEC | Single ARIMA | 75.5647 | 0.475% | 10.3131
SSEC | Single ANFIS | 77.5780 | 0.516% | 11.2013
SSEC | EPC (TEPC) | 42.7637 | 0.261% | 5.6642
SSEC | EPC (FEPC) | 43.2455 | 0.264% | 5.7546
NASDAQ | BP neural network | 152.3786 | 0.529% | 15.5248
NASDAQ | Wavelet neural network | 101.4542 | 0.384% | 11.2415
NASDAQ | Single SVM | 129.5699 | 0.463% | 13.9719
NASDAQ | Single ARIMA | 132.5453 | 0.490% | 14.3580
NASDAQ | Single ANFIS | 135.2486 | 0.501% | 14.6854
NASDAQ | EPC (TEPC) | 89.4576 | 0.336% | 9.8453
NASDAQ | EPC (FEPC) | 89.3404 | 0.321% | 9.6338

4.2 SSEC index

Following the basic process in Fig.5, the preliminary forecast result is shown in Fig.7. Compared with the real data, the predicted data basically reflects the volatility pattern of SSEC, but two significant problems remain: first, the prediction curve lags noticeably behind the real data curve; second, large errors appear at the inflection points. This indicates that the single SVM model can hardly achieve satisfactory results.

As mentioned in Section 3, there are two ways to set up the simultaneous error-prediction forecasting model, TEPC and FEPC. For TEPC, the training error sequence over the A1 and A2 datasets must be decomposed; as shown in Fig.(a) of the Appendix, the green curves correspond to the time span of the A1 dataset and the cyan curves to that of A2. The training error sequence TE(A1,A2) decomposes into 7 IMFs and 1 overall trend. Taking the best C and g selected by the ICS algorithm (shown in Table 2), we set up an SVM model for each IMF, obtain its predictive value, and integrate all of them into the final predicted error Ep(A3)1, shown on the left of Fig.8. Using Ep(A3)1 to correct the preliminary prediction P(A3) yields P*(A3)1, illustrated on the right of Fig.8; the red curve is the final predictive result of SSEC through the TEPC method.

Fig.8 clearly shows that, through correction by the predicted error, the two problems mentioned above are largely alleviated compared with the preliminary results. The forecasting performance metrics are summarized in Table 3.

Similarly, FEPC also requires decomposing an error sequence, but here the error sequence Ep(A2) is calculated as the real data minus the predictions produced by the SVM model trained on the A1 dataset; Ep(A2) therefore has the same time span as A2 and is marked in cyan. The EMD results are shown in Fig.(b) of the Appendix: Ep(A2) decomposes into 5 IMFs and 1 overall trend. The number of components differs from that of TE(A1,A2) mainly because TE(A1,A2) has 280 data points while Ep(A2) has only 100. Following the same procedure as TEPC, we select the best C and g for each IMF and obtain the predicted error sequence Ep(A3)2 and the final predictive results P*(A3)2 of SSEC. The parameter selection results are summarized in Table 2 and the final predictive results are shown in Fig.9, from which it is evident that FEPC also clearly improves the prediction. However, compared with the results in Fig.8, it is hard to tell which is more accurate; from the performance metrics summarized in Table 3, we can infer that TEPC is slightly better than FEPC.
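The correction step shared by TEPC and FEPC can be sketched schematically: the per-IMF error forecasts (including the trend r) are summed into one predicted error sequence Ep(A3), which is added to the preliminary forecast P(A3), since the error is defined as real minus predicted. The toy numbers below are illustrative only, not the paper's data:

```python
def combine_imf_forecasts(imf_forecasts):
    """Sum the per-component error forecasts back into one error sequence."""
    return [sum(values) for values in zip(*imf_forecasts)]

def correct(preliminary, predicted_error):
    """P*(A3) = P(A3) + Ep(A3)."""
    return [p + e for p, e in zip(preliminary, predicted_error)]

# Example: two IMF error forecasts plus a trend, over a 3-point horizon.
ep = combine_imf_forecasts([[0.5, -0.2, 0.1], [0.1, 0.0, -0.1], [1.0, 1.0, 1.0]])
corrected = correct([2300.0, 2310.0, 2305.0], ep)
```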

From Table 2 we can see that ICS achieves better fitness values with fewer evolution steps, indicating that the ICS method adapts well to parameter selection for the SVM model.

4.3 NASDAQ index

The forecasting procedure for the NASDAQ index is similar to that for the SSEC index illustrated in Section 4.2. Fig.11–Fig.13 show the results of the NASDAQ forecasting process, and the forecasting performance metrics are summarized in Table 3.

Figure 11 The preliminary predictive result of NASDAQ through single SVM

Figure 12 The Ep(A3)1 and P*(A3)1 of NASDAQ through TEPC

Figure 13 The Ep(A3)2 and P*(A3)2 of NASDAQ through FEPC

Judging from Fig.11, the single SVM model again fails to produce satisfactory predictive results for the NASDAQ index, as it did for the SSEC index. Because the NASDAQ index belongs to a much more mature market, it may exhibit higher-frequency and more chaotic behavior. It therefore seems all the more necessary to handle the large errors in order to improve the predictive results. The following process shows that our proposed method greatly improves the accuracy of the NASDAQ predictions, as it did for SSEC.

First, using TEPC, we obtain the training error sequence through the single SVM model. The TE(A1,A2) of the NASDAQ index has characteristics similar to that of SSEC: it decomposes into 7 IMFs and 1 overall trend, whose patterns are illustrated in Fig.(c) of the Appendix. Taking for each IMF a different penalty parameter C and kernel parameter g selected by the ICS algorithm (summarized in Table 4), we set up an SVM model per IMF and obtain the final predicted error Ep(A3)1, which is used to correct the preliminary prediction P(A3) into the final result P*(A3)1. As illustrated in Fig.12, the red curve is the final predictive result of NASDAQ through the TEPC method. Through correction by the predicted error, the two problems mentioned above are again largely alleviated compared with the preliminary results, meaning that the TEPC method is also effective for predicting NASDAQ.

Table 4

The parameter selecting results for each IMF through 5 different methods (NASDAQ)

Error sequence | Component | GS | GA | PSO | CS | ICS | Best C | Best g
(The GS–ICS columns give MSE (fitness value) / evolution steps.)
TE(A1,A2) | Imf1 | 0.00463/- | 0.0083/552 | 0.03233/430 | 0.01032/641 | 0.00155/331 | 2^−11 | 2^−5
TE(A1,A2) | Imf2 | 0.00314/- | 0.0096/521 | 0.06217/421 | 0.01223/652 | 0.00253/320 | 2^−11 | 2^−4
TE(A1,A2) | Imf3 | 0.00626/- | 0.0101/493 | 0.07324/439 | 0.00975/633 | 0.00312/275 | 2^−12 | 2^−6
TE(A1,A2) | Imf4 | 0.00433/- | 0.0166/487 | 0.03235/384 | 0.02087/589 | 0.00330/364 | 2^−11 | 2^−5
TE(A1,A2) | Imf5 | 0.00514/- | 0.0172/483 | 0.03088/473 | 0.01723/566 | 0.00289/298 | 2^−10 | 2^−4
TE(A1,A2) | Imf6 | 0.00402/- | 0.0138/477 | 0.01022/412 | 0.02389/602 | 0.00257/201 | 2^−9 | 2^−4
TE(A1,A2) | Imf7 | 0.00188/- | 0.0169/431 | 0.03731/392 | 0.08323/511 | 0.00188/198 | 2^−10 | 2^−3
TE(A1,A2) | r | 0.00263/- | 0.0137/396 | 0.01845/413 | 0.06312/602 | 0.00161/113 | 2^−9 | 2^−6
Ep(A2) | Imf1 | 0.00450/- | 0.0183/511 | 0.03417/378 | 0.01025/611 | 0.00170/302 | 2^−7 | 2^−8
Ep(A2) | Imf2 | 0.00373/- | 0.0197/523 | 0.02368/436 | 0.01756/586 | 0.00303/312 | 2^−7 | 2^−2
Ep(A2) | Imf3 | 0.00262/- | 0.0203/485 | 0.03255/494 | 0.03440/503 | 0.00189/287 | 2^−11 | 2^−6
Ep(A2) | Imf4 | 0.00192/- | 0.0136/463 | 0.01870/402 | 0.02314/632 | 0.00152/165 | 2^−10 | 2^−4
Ep(A2) | r | 0.00138/- | 0.0095/363 | 0.03361/377 | 0.02001/571 | 0.00143/102 | 2^−9 | 2^−3

Then, using FEPC, we decompose the error sequence Ep(A2) into another series of components; as shown in Fig.(d) of the Appendix, Ep(A2) decomposes into 5 IMFs and 1 overall trend, and since Ep(A2) has the same time span as the A2 dataset it is likewise marked in cyan. Repeating the above forecasting process for each IMF finally yields the results shown in Fig.13. The parameter selection results are summarized in Table 4 and the forecasting performance metrics in Table 3.

Comparing the results in Fig.12 and Fig.13, it can be roughly judged that FEPC is somewhat better than TEPC; the metrics in Table 3 confirm that FEPC has smaller deviations, i.e. higher accuracy. So far, TEPC is slightly better than FEPC for the SSEC index, while FEPC performs better for the NASDAQ index; it is therefore hard to judge which of the two EPC variants is better overall.

4.4 Robustness evaluation

To evaluate the robustness of the proposed method, we test all methods mentioned in Table 3 under different ratios of training to testing sample sizes. Because our sample contains a small number of data points, we set the training ratios to 75%, 80%, 85%, 90% and 95%. The forecasting results for SSEC and NASDAQ under these five ratios are summarized in Table 5. The proposed EPC method outperforms the other benchmarking tools under all ratios in terms of the three performance measures, showing that the EPC method indeed provides better forecast accuracy.

Table 5

Robustness evaluation

Ratio | Model | SSEC RMSE | SSEC MAPE | SSEC MAE | NASDAQ RMSE | NASDAQ MAPE | NASDAQ MAE
75% | BP neural network | 90.3210 | 0.6133% | 14.0154 | 167.4531 | 0.550% | 18.9915
75% | Wavelet neural network | 60.2154 | 0.370% | 8.4663 | 122.4560 | 0.464% | 13.7505
75% | Single SVM | 80.5740 | 0.524% | 11.9784 | 143.1057 | 1.0255% | 16.0312
75% | Single ARIMA | 81.3540 | 0.524% | 12.6754 | 144.3581 | 0.486% | 16.7740
75% | Single ANFIS | 83.1461 | 0.555% | 13.8451 | 148.7543 | 0.518% | 17.8752
75% | EPC (TEPC) | 50.3458 | 0.281% | 6.4245 | 98.6417 | 0.351% | 12.1241
75% | EPC (FEPC) | 49.6482 | 0.280% | 6.4010 | 98.5378 | 0.348% | 12.0012
80% | BP neural network | 87.3241 | 0.602% | 13.8512 | 162.3054 | 0.545% | 18.6420
80% | Wavelet neural network | 58.3215 | 0.359% | 8.2460 | 118.1057 | 0.386% | 13.2004
80% | Single SVM | 77.6844 | 0.515% | 11.8454 | 139.0354 | 0.461% | 15.7566
80% | Single ARIMA | 79.2354 | 0.524% | 12.0488 | 140.2778 | 0.475% | 16.2521
80% | Single ANFIS | 82.3452 | 0.574% | 13.1864 | 145.5420 | 0.506% | 17.3101
80% | EPC (TEPC) | 47.6442 | 0.266% | 6.1258 | 95.6121 | 0.344% | 11.7754
80% | EPC (FEPC) | 48.1245 | 0.277% | 6.3782 | 95.7080 | 0.351% | 12.0138
85% | BP neural network | 85.5624 | 0.588% | 12.7821 | 159.3742 | 0.537% | 16.3421
85% | Wavelet neural network | 56.9850 | 0.362% | 7.8641 | 115.6547 | 0.396% | 12.0425
85% | Single SVM | 75.3204 | 0.488% | 10.6027 | 136.8579 | 0.492% | 14.9821
85% | Single ARIMA | 77.9056 | 0.504% | 10.9472 | 137.9540 | 0.504% | 15.3473
85% | Single ANFIS | 79.7856 | 0.577% | 12.5525 | 140.1275 | 0.526% | 16.0036
85% | EPC (TEPC) | 45.6414 | 0.278% | 6.0457 | 94.6477 | 0.362% | 11.0210
85% | EPC (FEPC) | 45.5970 | 0.274% | 5.9643 | 93.7204 | 0.357% | 10.8542
90% | BP neural network | 85.3785 | 0.581% | 12.6541 | 157.8512 | 0.534% | 15.9123
90% | Wavelet neural network | 56.5432 | 0.351% | 7.6420 | 112.2306 | 1.0255% | 0.04187
90% | Single SVM | 74.6241 | 0.460% | 10.0113 | 134.3412 | 0.494% | 14.2304
90% | Single ARIMA | 77.5614 | 0.492% | 10.7054 | 134.4021 | 0.507% | 14.7125
90% | Single ANFIS | 78.4520 | 0.533% | 11.6121 | 137.3342 | 1.0255% | 15.1004
90% | EPC (TEPC) | 44.0274 | 0.274% | 5.9642 | 91.6801 | 0.346% | 10.3246
90% | EPC (FEPC) | 43.8720 | 0.270% | 5.8767 | 91.5902 | 0.340% | 10.1206
95% | BP neural network | 83.3241 | 0.573% | 12.3579 | 154.2472 | 0.534% | 15.8437
95% | Wavelet neural network | 54.5231 | 0.349% | 7.5631 | 107.0247 | 0.394% | 11.6855
95% | Single SVM | 73.8614 | 0.457% | 9.9216 | 130.2104 | 0.475% | 14.1021
95% | Single ARIMA | 76.4264 | 0.488% | 10.5852 | 133.2423 | 0.494% | 14.6682
95% | Single ANFIS | 77.8460 | 0.529% | 11.4672 | 136.5784 | 0.502% | 14.9031
95% | EPC (TEPC) | 43.8214 | 0.272% | 5.8930 | 90.7764 | 0.341% | 10.1129
95% | EPC (FEPC) | 43.3145 | 0.267% | 5.7873 | 90.5430 | 0.338% | 10.0357

In the robustness evaluation, FEPC performs better than TEPC on the whole. The reason can be explained as follows: the SVM model is fitted on the training data before forecasting, so the training error over the A1 and A2 time spans arises from known samples, whereas the predicted error over the A2 time span, forecast from the training error in A1, arises from unknown samples and is thus much more similar to the predicted error over the A3 time span; that is, they share analogous regularities and features.

5 Conclusion

This paper proposed a simultaneous error prediction and correction method (EPC) for stock index forecasting by integrating empirical mode decomposition (EMD), the support vector machine (SVM) and an improved cuckoo search algorithm (ICS). To remedy the cuckoo search algorithm's limited search ability and low accuracy, we proposed several improvement strategies; the results show that the improved algorithm achieves better accuracy with fewer evolution steps than the other 4 benchmarking methods. The basic steps are: use the SVM model to obtain a preliminary result; use EMD to decompose the resulting error sequence into several IMFs; set up an SVM model for each IMF and integrate their forecasts; and finally use the predicted error to correct the preliminary result. The ICS algorithm performs the parameter selection throughout.

For the empirical research, we use one emerging daily stock market index (SSEC) and one mature daily stock market index (NASDAQ) as samples. To compare the performance of the proposed method, the BP neural network, the wavelet network, the single SVM, the single ARIMA and the single ANFIS are used as reference methods. The empirical research and robustness evaluation show that the proposed method outperforms the 5 reference methods in forecasting the stock indexes. Moreover, the ICS algorithm shows good prospects for optimal parameter selection, suggesting it may be applied to other optimization problems. Future research could combine the EPC idea with other forecasting tools, such as neural networks or ARIMA, to improve their forecasting abilities.

References

[1] Pai P F, Lin C S. A hybrid ARIMA and support vector machines model in stock price forecasting. Omega, 2005, 33(6): 497–505. doi:10.1016/j.omega.2004.07.024

[2] Wang Y H. Nonlinear neural network forecasting model for stock index option price: Hybrid GJR-GARCH approach. Expert Systems with Applications, 2009, 36(1): 564–570. doi:10.1016/j.eswa.2007.09.056

[3] Chi S C, Chen H P, Cheng C H. A forecasting approach for stock index future using grey theory and neural networks. International Joint Conference on Neural Networks, IJCNN'99, IEEE, 1999, 6: 3850–3855. doi:10.1109/IJCNN.1999.830769

[4] Wang J J, Wang J Z, Zhang Z G, et al. Stock index forecasting based on a hybrid model. Omega, 2012, 40(6): 758–766. doi:10.1016/j.omega.2011.07.008

[5] Lu C J, Wu J Y, Chiu C C, et al. Predicting stock index using an integrated model of NLICA, SVR and PSO. Advances in Neural Networks, ISNN 2011, Springer, 2011: 228–237. doi:10.1007/978-3-642-21111-9_25

[6] Kim K J. Financial time series forecasting using support vector machines. Neurocomputing, 2003, 55(1): 307–319. doi:10.1016/S0925-2312(03)00372-2

[7] Chiu D Y, Chen P J. Dynamically exploring internal mechanism of stock market by fuzzy-based support vector machines with high dimension input space and genetic algorithm. Expert Systems with Applications, 2009, 36(2): 1240–1248. doi:10.1016/j.eswa.2007.11.022

[8] Kao L J, Chiu C C, Lu C J, et al. Integration of nonlinear independent component analysis and support vector regression for stock price forecasting. Neurocomputing, 2013, 99(1): 534–542. doi:10.1016/j.neucom.2012.06.037

[9] Hsu S H, Hsieh J, Chih T C, et al. A two-stage architecture for stock price forecasting by integrating self-organizing map and support vector regression. Expert Systems with Applications, 2009, 36(4): 7947–7951. doi:10.1016/j.eswa.2008.10.065

[10] Lee M C. Using support vector machine with a hybrid feature selection method to the stock trend prediction. Expert Systems with Applications, 2009, 36(8): 10896–10904. doi:10.1016/j.eswa.2009.02.038

[11] Madsen H, Skotner C. Adaptive state updating in real-time river flow forecasting: A combined filtering and error forecasting procedure. Journal of Hydrology, 2005, 308(1): 302–312. doi:10.1016/j.jhydrol.2004.10.030

[12] Yao Y, Lian Z, Hou Z, et al. An innovative air-conditioning load forecasting model based on RBF neural network and combined residual error correction. International Journal of Refrigeration, 2006, 29(4): 528–538. doi:10.1016/j.ijrefrig.2005.10.008

[13] Orrell D, Smith L, Barkmeijer J, et al. Model error in weather forecasting. Nonlinear Processes in Geophysics, 2001, 8(6): 357–371. doi:10.5194/npg-8-357-2001

[14] Allen M R, Kettleborough J, Stainforth D. Model error in weather and climate forecasting. ECMWF Predictability of Weather and Climate Seminar, European Centre for Medium Range Weather Forecasts, Reading, UK, http://www.ecmwf.int/publications/library/do/references/list/209, 2002. doi:10.1017/CBO9780511617652.016

[15] Zhou M, Yan Z, Ni Y X, et al. A novel ARIMA approach on electricity price forecasting with the improvement of predicted error. Proceedings of the CSEE, 2004, 12: 013.

[16] Chen A S, Leung M T. Regression neural network for error correction in foreign exchange forecasting and trading. Computers & Operations Research, 2004, 31(7): 1049–1068. doi:10.1016/S0305-0548(03)00064-9

[17] Anderson R G, Hoffman D L, Rasche R H. A vector error-correction forecasting model of the US economy. Journal of Macroeconomics, 2002, 24(4): 569–598. doi:10.1016/S0164-0704(02)00067-8

[18] Kao L J, Chiu C C, Lu C J, et al. A hybrid approach by integrating wavelet-based feature extraction with MARS and SVR for stock index forecasting. Decision Support Systems, 2013, 54(3): 1228–1244. doi:10.1016/j.dss.2012.11.012

[19] Chang P C, Fan C Y. A hybrid system integrating a wavelet and TSK fuzzy rules for stock price forecasting. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2008, 38(6): 802–815. doi:10.1109/TSMCC.2008.2001694

[20] Lu C J, Lee T S, Chiu C C. Financial time series forecasting using independent component analysis and support vector regression. Decision Support Systems, 2009, 47(2): 115–125. doi:10.1016/j.dss.2009.02.001

[21] Zhu Z, Sun Y, Li H. Hybrid of EMD and SVMs for short-term load forecasting. IEEE International Conference on Control and Automation, 2007: 1044–1047. doi:10.1109/ICCA.2007.4376516

[22] Yu L A, Lai K K, Wang S Y, et al. Oil price forecasting with an EMD-based multiscale neural network learning paradigm. Computational Science-ICCS 2007, Springer, 2007: 925–932. doi:10.1007/978-3-540-72588-6_148

[23] Lin A, Shang P, Feng G, et al. Application of empirical mode decomposition combined with k-nearest neighbors approach in financial time series forecasting. Fluctuation and Noise Letters, 2012, 11(2): 1–14. doi:10.1142/S0219477512500186

[24] Yu L A, Wang S Y, Lai K K. Financial crisis modeling and prediction with a Hilbert-EMD-based SVM approach. Intelligent Data Analysis: Developing New Methodologies Through Pattern Discovery and Recovery, 2009: 286–299. doi:10.4018/978-1-59904-982-3.ch017

[25] Yu L A, Wang S Y, Lai K K. Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm. Energy Economics, 2008, 30(5): 2623–2635. doi:10.1016/j.eneco.2008.05.003

[26] Lin C S, Chiu S H, Lin T Y. Empirical mode decomposition-based least squares support vector regression for foreign exchange rate forecasting. Economic Modelling, 2012, 29(6): 2583–2590. doi:10.1016/j.econmod.2012.07.018

[27] Nguyen T, Gordon-Brown L, Wheeler P, et al. GA-SVM based framework for time series forecasting. Fifth International Conference on Natural Computation, ICNC'09, IEEE, 2009, 1: 493–498. doi:10.1109/ICNC.2009.292

[28] Yuan F C. Parameters optimization using genetic algorithms in support vector regression for sales volume forecasting. Applied Mathematics, 2012, 30(3): 1480–1486. doi:10.4236/am.2012.330207

[29] Wu C H, Tzeng G H, Goo Y J, et al. A real-valued genetic algorithm to optimize the parameters of support vector machine for predicting bankruptcy. Expert Systems with Applications, 2007, 32(2): 397–408. doi:10.1016/j.eswa.2005.12.008

[30] Abolhassani A M, Yaghoobi M. Stock price forecasting using PSOSVM. 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE), IEEE, 2010, V3: 352. doi:10.1109/ICACTE.2010.5579738

[31] Lu C J, Wu J Y, Chiu C C, et al. Predicting stock index using an integrated model of NLICA, SVR and PSO. Advances in Neural Networks-ISNN 2011, Springer, 2011: 228–237. doi:10.1007/978-3-642-21111-9_25

[32] Yang X S, Deb S. Cuckoo search via Lévy flights. World Congress on Nature & Biologically Inspired Computing, NaBIC 2009, IEEE, 2009: 210–214. doi:10.1109/NABIC.2009.5393690

[33] Gandomi A H, Yang X S, Alavi A H. Cuckoo search algorithm: A metaheuristic approach to solve structural optimization problems. Engineering with Computers, 2013, 29(1): 17–35. doi:10.1007/s00366-011-0241-y

[34] Walton S, Hassan O, Morgan K, et al. Modified cuckoo search: A new gradient free optimisation algorithm. Chaos, Solitons & Fractals, 2011, 44(9): 710–718. doi:10.1016/j.chaos.2011.06.004

[35] Yang X S, Deb S. Engineering optimisation by cuckoo search. International Journal of Mathematical Modelling and Numerical Optimisation, 2010, 1(4): 330–343. doi:10.1504/IJMMNO.2010.035430

[36] Cortes C, Vapnik V. Support-vector networks. Machine Learning, 1995, 20(3): 273–297. doi:10.1007/BF00994018

[37] Huang N E, Shen Z, Long S R, et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London, Series A: Mathematical, Physical and Engineering Sciences, 1998, 454(1971): 903–995. doi:10.1098/rspa.1998.0193

[38] Huang N E, Shen Z, Long S R. A new view of nonlinear water waves: The Hilbert spectrum. Annual Review of Fluid Mechanics, 1999, 31(1): 417–457. doi:10.1146/annurev.fluid.31.1.417

[39] Wu Z, Huang N E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Advances in Adaptive Data Analysis, 2009, 1(1): 1–41. doi:10.1142/S1793536909000047

[40] Boudraa A, Cexus J, Saidi Z. EMD-based signal noise reduction. International Journal of Signal Processing, 2004, 1(1): 33–37.

[41] De Ramirez S S, Enquobahrie D, Nyadzi G, et al. Prevalence and correlates of hypertension: A cross-sectional study among rural populations in sub-Saharan Africa. Journal of Human Hypertension, 2010, 24(12): 786–795. doi:10.1038/jhh.2010.14

[42] Payne R B, Sorensen M D. The Cuckoos. Oxford University Press, 2005. doi:10.1093/oso/9780198502135.003.0003

[43] Wheatcroft D J. Co-evolution: A behavioral 'spam filter' to prevent nest parasitism. Current Biology, 2009, 19(4): R170–R171. doi:10.1016/j.cub.2008.12.034

[44] Viswanathan G, Afanasyev V, Buldyrev S V, et al. Lévy flights search patterns of biological organisms. Physica A: Statistical Mechanics and its Applications, 2001, 295(1): 85–88. doi:10.1016/S0378-4371(01)00057-7

[45] Reynolds A. Cooperative random Lévy flight searches and the flight patterns of honeybees. Physics Letters A, 2006, 354(5): 384–388. doi:10.1016/j.physleta.2006.01.086

[46] Reynolds A M, Frye M A. Free-flight odor tracking in Drosophila is consistent with an optimal intermittent scale-free search. PLoS ONE, 2007, 2(4): e354. doi:10.1371/journal.pone.0000354

[47] Ramos-Fernández G, Mateos J L, Miramontes O, et al. Lévy walk patterns in the foraging movements of spider monkeys (Ateles geoffroyi). Behavioral Ecology and Sociobiology, 2004, 55(3): 223–230. doi:10.1007/s00265-003-0700-6

[48] Schreier A L, Grove M. Ranging patterns of hamadryas baboons: Random walk analyses. Animal Behaviour, 2010, 80(1): 75–87. doi:10.1016/j.anbehav.2010.04.002

[49] Da Luz M, Buldyrev S V, Havlin S, et al. Improvements in the statistical approach to random Lévy flight searches. Physica A: Statistical Mechanics and its Applications, 2001, 295(1): 89–92. doi:10.1016/S0378-4371(01)00058-9

[50] Reynolds A, Rhodes C. The Lévy flight paradigm: Random search patterns and mechanisms. Ecology, 2009, 90(4): 877–887. doi:10.1890/08-0153.1

[51] Rajabioun R. Cuckoo optimization algorithm. Applied Soft Computing, 2011, 11(8): 5508–5518. doi:10.1016/j.asoc.2011.05.008

[52] Kim K J. Financial time series forecasting using support vector machines. Neurocomputing, 2003, 55(1): 307–319. doi:10.1016/S0925-2312(03)00372-2

[53] Achelis S B. Technical Analysis from A to Z. McGraw-Hill, New York, 2001.

[54] Chang J, Jung Y, Yeon K, et al. Technical Indicators and Analysis Methods. Seoul: Jinritamgu Publishing, 1996.

[55] Lee S H, Lim J S. KOSPI time series analysis using neural network with weighted fuzzy membership functions. Agent and Multi-Agent Systems: Technologies and Applications, Springer, 2008: 53–62. doi:10.1007/978-3-540-78582-8_6

[56] Murphy J J. Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications. Prentice Hall Press, 1999.

Appendix EMD results of the two indexes

Figure 14 EMD results of the two indexes

Received: 2014-1-22
Accepted: 2014-4-14
Published Online: 2014-12-25

© 2014 Walter de Gruyter GmbH, Berlin/Boston
