
Construction and Practice of the Optimal Smooth Semi-Supervised Support Vector Machine

Xiaodan Zhang, Ang Li and Pan Ran
Published/Copyright: October 25, 2015

Abstract

The standard semi-supervised support vector machine (S3VM) leads to an unconstrained optimization problem that is non-convex and non-smooth, so various smoothing methods have been applied to the S3VM. In this paper, a new smooth semi-supervised support vector machine (SS3VM) model, based on a biquadratic spline function, is proposed, and a hybrid genetic algorithm (GA)/SS3VM approach is presented to optimize the parameters of the model. Numerical experiments are performed to test the efficiency of the model. The experimental results show that our optimal SS3VM model generally outperforms the other optimal SS3VM models considered in this paper.

1 Introduction

Support vector machines (SVMs) were introduced by Vapnik[1] in the early 1990s. As an effective data mining method, SVMs are widely applied in fields ranging from text categorization[2], image retrieval[3], face recognition[4], goal-driven planning[5], and information security[6] to credit risk evaluation[7], bankruptcy prediction[8], time series forecasting[9], etc. The standard SVM is based on supervised learning; as a result, a large number of labeled samples are required to ensure preferable classification accuracy. However, manual labelling is often a slow, expensive, and error-prone process. By contrast, semi-supervised support vector machines (S3VMs)[10–12] can exploit labeled and unlabeled samples simultaneously and thus reduce the labelling cost significantly. Hence, S3VMs have attracted wide attention from researchers[13–15]. Since the unconstrained optimization problem of the S3VM is non-convex and non-smooth, most fast algorithms cannot be used to solve it. In this regard, many researchers have worked on smooth S3VMs (SS3VMs) in recent years[15–18]. SS3VM models smoothed by the Gaussian function[12], polynomial functions[19], and the cubic spline function[20] have been proposed successively. However, the influence of the parameters of the SS3VM on the classification results has not been deeply studied. In fact, the effect of the parameters on the performance of the SS3VM cannot be ignored. Therefore, selecting appropriate parameters for the SS3VM is crucial.

In this paper, we first explore the smoothing method and propose a new SS3VM model based on a biquadratic spline function. Second, we focus on the optimization of the parameters of the SS3VM models: a hybrid approach combining the genetic algorithm (GA)[21] with the SS3VM is introduced. Finally, we evaluate the new model through numerical experiments. Four datasets are used to test the efficiency of the new model. Moreover, the effect of the number of labeled samples on the classification accuracy of the new model is analyzed.

2 The SS3VM Model

We consider the two-class classification problem. The dataset consists of $m$ labeled samples $\{(x_i, y_i)\}_{i=1}^{m}$ and $l$ unlabeled samples $\{\bar{x}_i\}_{i=1}^{l}$, where $x_i, \bar{x}_i \in \mathbb{R}^n$ and $y_i \in \{-1, 1\}$. The labeled samples $x_i\ (i=1,2,\cdots,m)$ and the unlabeled samples $\bar{x}_i\ (i=1,2,\cdots,l)$ are collected row-wise in an $m \times n$ matrix $A$ and an $l \times n$ matrix $B$, respectively. The class labels of the labeled samples are encoded in an $m \times m$ diagonal matrix $D$ with $+1$ or $-1$ along its diagonal. Precisely, a standard unconstrained optimization model of the S3VM is given as follows[22]:

$$\min_{\omega,b}\ \frac{1}{2}\|\omega\|_2^2 + c\,e_1^T\Lambda\big(D(A\omega+e_1b)\big) + c^*e_2^T\Lambda\big(|B\omega+e_2b|\big) \qquad (1)$$

Here, $\omega$ is the normal vector to the bounding plane, $b$ is a bias value, $e_i\ (i=1,2)$ is a column vector of ones with $e_1\in\mathbb{R}^m$, $e_2\in\mathbb{R}^l$, and $c$ and $c^*$ are the penalty parameters. $\Lambda(t)=\max(0,1-t)$ is called the hinge loss function and $\Lambda(|t|)=\max(0,1-|t|)$ is called the symmetric hinge loss function[6]. If $u=(u_1,u_2,\cdots,u_n)\in\mathbb{R}^n$, then $\Lambda(u)=(\Lambda(u_1),\Lambda(u_2),\cdots,\Lambda(u_n))^T$. Nevertheless, $\Lambda(t)$ and $\Lambda(|t|)$ are not differentiable. As a result, the objective function in Model (1) is non-convex and non-smooth, which precludes the application of many fast optimization methods such as the BFGS algorithm[23, 25]. We therefore modify Model (1) as follows:

$$\min_{\omega,b}\ \frac{1}{2}\|\omega\|_2^2 + \frac{c}{2}\left\|\Lambda\big(D(A\omega+e_1b)\big)\right\|_2^2 + \frac{c^*}{2}\left\|f(B\omega+e_2b,\,r)\right\|_2^2 \qquad (2)$$

Here, $f(x,r)$ is a smooth function at the origin, used to approximate the symmetric hinge loss function, and $r$ is the associated smoothing parameter. Replacing the 1-norm by the squared 2-norm removes the non-smoothness of the hinge loss $\Lambda(x)$ at $+1$ and of the symmetric hinge loss at $+1$ and $-1$, while the remaining kink of the symmetric hinge loss at the origin is handled by the smooth function $f$. In the next section, a biquadratic spline function is constructed to replace $f(x,r)$ in Model (2).
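For concreteness, the two loss functions are easy to write down in code. The following sketch (Python with NumPy; the function names are ours, not from the paper) evaluates the hinge loss and the symmetric hinge loss componentwise, which is how they enter Models (1) and (2).

```python
import numpy as np

def hinge(t):
    """Hinge loss: Lambda(t) = max(0, 1 - t), applied componentwise."""
    return np.maximum(0.0, 1.0 - t)

def symmetric_hinge(t):
    """Symmetric hinge loss: Lambda(|t|) = max(0, 1 - |t|), applied componentwise."""
    return np.maximum(0.0, 1.0 - np.abs(t))

# Both losses have kinks: the hinge loss at t = 1, the symmetric hinge loss at
# t = -1, 0, +1, which is why Model (1) is non-smooth and a smooth surrogate is needed.
t = np.linspace(-2.0, 2.0, 9)
print(hinge(t))
print(symmetric_hinge(t))
```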

3 The SS3VM Model Based on the Biquadratic Spline Function

3.1 Construction of Biquadratic Spline Function

Definition 1

For $k>1$, $y_1>0$, and $m,n\in\mathbb{Z}^+$, let $x_0=-\frac{1}{k}$, $x_1=0$, $x_2=\frac{1}{k}$ be a set of nodes. The function $s(x,k)$ is defined as follows:

$$s(x,k)=\begin{cases} s_0(x,k), & -\frac{1}{k}\le x<0\\ s_1(x,k), & 0\le x\le\frac{1}{k}\\ \Lambda(|x|), & |x|>\frac{1}{k} \end{cases} \qquad (3)$$

Here $s_0(x,k)$ and $s_1(x,k)$ are polynomials of degree $n$. If $s(x,k)$ satisfies the following conditions:

  1. $s^{(d)}(x_0,k)=0,\ d=2,3,\cdots,m$; $s'(x_0,k)=1$, $s(x_0,k)=1-\frac{1}{k}$;

  2. $s^{(d)}(x_2,k)=0,\ d=2,3,\cdots,m$; $s'(x_2,k)=-1$, $s(x_2,k)=1-\frac{1}{k}$;

  3. $s_0^{(d)}(x_1-0,k)=s_1^{(d)}(x_1+0,k),\ d=1,2,\cdots,m$; $s_0(x_1-0,k)=s_1(x_1+0,k)=y_1$;

then $s(x,k)$ is called the degree-$n$ spline function with the $m$-order smoothness condition at the origin approximating $\Lambda(|x|)$.

Theorem 1

Let $k>1$ and let $x_0=-\frac{1}{k}$, $x_1=0$, $x_2=\frac{1}{k}$ be a set of nodes. Then there exists a unique biquadratic (quartic) spline $s(x,k)$ with second-order smoothness at the origin approximating $\Lambda(|x|)$, and it has the following expression:

$$s(x,k)=\begin{cases} \frac{1}{8}k^3x^4-\frac{3}{4}kx^2-\frac{3}{8k}+1, & -\frac{1}{k}\le x\le\frac{1}{k}\\ \Lambda(|x|), & |x|>\frac{1}{k} \end{cases} \qquad (4)$$

Proof

Let $s(x,k)$ be a quartic spline function with second-order smoothness at the origin that satisfies the conditions in Definition 1. We derive the expression of $s(x,k)$ on $[-\frac{1}{k},\frac{1}{k}]$. Set $s^{(3)}(x_i,k)=M_i\ (i=0,1,2)$ and $h_0=x_1-x_0=\frac{1}{k}$. For $x\in[-\frac{1}{k},0]$ we have $s(x,k)=s_0(x,k)$. Since $s_0(x,k)$ is a polynomial of degree four on $[-\frac{1}{k},0]$, $s_0^{(3)}(x,k)$ is a linear function, which can be written as

$$s_0^{(3)}(x,k)=M_0\frac{x_1-x}{h_0}+M_1\frac{x-x_0}{h_0}=-M_0kx+M_1k(x-x_0) \qquad (5)$$

Integrating this equation three times successively, we obtain

$$s_0(x,k)+\frac{a_1}{2}x^2+a_2x+a_3=-\frac{M_0k}{4!}x^4+\frac{M_1k}{4!}(x-x_0)^4$$

where $a_1$, $a_2$, and $a_3$ are constants of integration. According to condition 1 in Definition 1, we can determine $a_1=-\frac{M_0}{2k}$, $a_2=-\frac{M_0}{3k^2}-1$, and $a_3=-\frac{M_0}{8k^3}-1$.

Similarly, on $[0,\frac{1}{k}]$, we have

$$s_1(x,k)+\frac{b_1}{2}x^2+b_2x+b_3=\frac{M_2k}{4!}x^4-\frac{M_1k}{4!}(x_2-x)^4 \qquad (6)$$

where $b_1$, $b_2$, and $b_3$ are constants of integration. Based on condition 2 in Definition 1, it follows that $b_1=\frac{M_2}{2k}$, $b_2=-\frac{M_2}{3k^2}+1$, and $b_3=\frac{M_2}{8k^3}-1$.

In this way, we have shown that $s(x,k)$ is a piecewise polynomial of degree four with parameters $M_0$, $M_1$, and $M_2$ on $[-\frac{1}{k},\frac{1}{k}]$. Furthermore, applying condition 3 in Definition 1, we obtain the matrix equation

$$\begin{pmatrix} 3 & 2 & 3\\ \frac{1}{3k^2} & 0 & -\frac{1}{3k^2}\\ 1 & 2 & 1 \end{pmatrix}\begin{pmatrix} M_0\\ M_1\\ M_2 \end{pmatrix}=\begin{pmatrix} 0\\ -2\\ 0 \end{pmatrix} \qquad (7)$$

Because the coefficient matrix of Equation (7) is nonsingular, we obtain the unique solution $M_0=-3k^2$, $M_1=0$, and $M_2=3k^2$. Finally, the biquadratic spline function $s(x,k)$ with second-order smoothness at the origin approximating $\Lambda(|x|)$ is obtained as Equation (4).
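As a quick sanity check of Equation (7) and its solution, the 3 by 3 system can be solved symbolically; the short sketch below (our own verification, not part of the original paper) reproduces $M_0=-3k^2$, $M_1=0$, $M_2=3k^2$.

```python
import sympy as sp

k = sp.symbols('k', positive=True)
M = sp.Matrix([[3, 2, 3],
               [1/(3*k**2), 0, -1/(3*k**2)],
               [1, 2, 1]])
rhs = sp.Matrix([0, -2, 0])

# Solve the linear system of Equation (7) for (M0, M1, M2).
sol = sp.simplify(M.LUsolve(rhs))
print(sol.T)  # expected: Matrix([[-3*k**2, 0, 3*k**2]])
```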

Theorem 2

Let $\Lambda(|x|)$ be the symmetric hinge loss function and let $s(x,k)$ be the biquadratic spline function given in Equation (4). Then, for $x\in\mathbb{R}$ and $k>1$, we have:

  1. $0\le s(x,k)\le\Lambda(|x|)$;

  2. $0\le\Lambda^2(|x|)-s^2(x,k)\le\frac{3}{8k}\left(2-\frac{3}{8k}\right)$.

Proof

If $x\in(-\infty,-\frac{1}{k})\cup(\frac{1}{k},+\infty)$, both inequalities 1 and 2 hold, since $s(x,k)=\Lambda(|x|)$ there. It remains to show that the inequalities hold for $x\in[-\frac{1}{k},\frac{1}{k}]$.

1) If $x\in[-\frac{1}{k},0]$, then $s(x,k)=s_0(x,k)=\frac{1}{8}k^3x^4-\frac{3}{4}kx^2-\frac{3}{8k}+1$. Thus $s_0''(x,k)=\frac{3}{2}k^3x^2-\frac{3}{2}k\le 0$, so $\min_x s_0'(x,k)=s_0'(0,k)=0$ and hence $s_0(x,k)$ is an increasing function on this interval. It follows that

$$1-\frac{3}{8k}=s_0(0,k)\ge s_0(x,k)\ge s_0\!\left(-\frac{1}{k},k\right)=1-\frac{1}{k}\ge 0 \qquad (8)$$

In addition, let $\lambda(x,k)=\Lambda(|x|)-s_0(x,k)=x-\frac{1}{8}k^3x^4+\frac{3}{4}kx^2+\frac{3}{8k}$. Then

$$\lambda''(x,k)=-\frac{3}{2}k^3x^2+\frac{3}{2}k\ge 0\ \Rightarrow\ \lambda'(x,k)\ge\lambda'\!\left(-\frac{1}{k},k\right)=0 \qquad (9)$$

Therefore, $\lambda(x,k)$ is also an increasing function on $[-\frac{1}{k},0]$, and $\min_x\lambda(x,k)=\lambda(-\frac{1}{k},k)=0$. So $0\le s_0(x,k)\le\Lambda(|x|)$ when $x\in[-\frac{1}{k},0]$.

If $x\in[0,\frac{1}{k}]$, then $s(x,k)=s_1(x,k)$, and inequality 1 follows in the same way as in the first part.

2) If $x\in[-\frac{1}{k},0]$, then since $\lambda(x,k)$ is increasing, its maximum $\frac{3}{8k}$ is attained at $x=0$. Hence,

$$0\le\Lambda(|x|)-s_0(x,k)\le\frac{3}{8k} \qquad (10)$$

In addition, $\Lambda(|x|)+s_0(x,k)=\lambda(x,k)+2s_0(x,k)$, so inequalities (8) and (10) imply that $\Lambda(|x|)+s_0(x,k)\le 2-\frac{3}{8k}$.

Thus, $0\le\Lambda^2(|x|)-s_0^2(x,k)\le\frac{3}{8k}\left(2-\frac{3}{8k}\right)$.

Similarly, if $x\in[0,\frac{1}{k}]$, we have $0\le\Lambda^2(|x|)-s^2(x,k)\le\frac{3}{8k}\left(2-\frac{3}{8k}\right)$ as well.
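The closed form in Equation (4) is straightforward to implement, and the bounds of Theorem 2 can be checked numerically. The sketch below (our own illustration in Python/NumPy, not code from the paper) evaluates $s(x,k)$ piecewise and verifies both inequalities on a fine grid for one value of $k$.

```python
import numpy as np

def sym_hinge(x):
    """Symmetric hinge loss Lambda(|x|) = max(0, 1 - |x|)."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def biquadratic_spline(x, k):
    """Biquadratic (quartic) spline s(x, k) of Equation (4), for k > 1."""
    x = np.asarray(x, dtype=float)
    inner = 0.125 * k**3 * x**4 - 0.75 * k * x**2 - 3.0 / (8.0 * k) + 1.0
    return np.where(np.abs(x) <= 1.0 / k, inner, sym_hinge(x))

k = 5.0
x = np.linspace(-2.0, 2.0, 20001)
gap = sym_hinge(x) - biquadratic_spline(x, k)              # inequality 1 and Eq. (10)
gap_sq = sym_hinge(x)**2 - biquadratic_spline(x, k)**2     # inequality 2
bound = 3.0 / (8.0 * k)
print(gap.min() >= -1e-12, gap.max() <= bound + 1e-12)
print(gap_sq.min() >= -1e-12, gap_sq.max() <= bound * (2.0 - bound) + 1e-12)
```

At $x=0$ the two gaps attain exactly $\frac{3}{8k}$ and $\frac{3}{8k}(2-\frac{3}{8k})$, so both bounds of Theorem 2 are tight.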

To illustrate the approximation accuracy of different smooth functions, a comparison of their smoothing performance is shown in Figure 1.

Figure 1: Different smooth functions approximating Λ(|x|)

As shown in Figure 1, the biquadratic spline function is closer to the symmetric hinge loss function than both the Gaussian function and the polynomial function. Although its approximation accuracy is slightly lower than that of the cubic spline function, the biquadratic spline function has a simpler expression.

3.2 The Smooth S3VM Model Based on the Biquadratic Spline Function

Replacing $f(x,r)$ in Model (2) by the function $s(x,k)$ of Equation (4), we obtain the following biquadratic spline smooth semi-supervised support vector machine model:

$$\min_{\omega,b}\ \frac{1}{2}\|\omega\|_2^2 + \frac{c}{2}\left\|\Lambda\big(D(A\omega+e_1b)\big)\right\|_2^2 + \frac{c^*}{2}\left\|s(B\omega+e_2b,\,k)\right\|_2^2 \qquad (11)$$

where k is the smooth parameter.
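To make Model (11) concrete, the sketch below assembles its objective as a plain NumPy function of $(\omega,b)$ and hands it to a quasi-Newton routine. This is our illustrative reconstruction under the paper's notation: the synthetic data, variable names, and the use of SciPy's BFGS with a finite-difference gradient are assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def biquadratic_spline(t, k):
    """s(t, k) from Equation (4), applied componentwise."""
    inner = 0.125 * k**3 * t**4 - 0.75 * k * t**2 - 3.0 / (8.0 * k) + 1.0
    return np.where(np.abs(t) <= 1.0 / k, inner, np.maximum(0.0, 1.0 - np.abs(t)))

def ss3vm_objective(theta, A, y, B, c, c_star, k):
    """Objective of Model (11); theta = (omega, b), y in {-1, +1} plays the role of diag(D)."""
    omega, b = theta[:-1], theta[-1]
    hinge = np.maximum(0.0, 1.0 - y * (A @ omega + b))   # Lambda(D(A w + e1 b))
    smooth = biquadratic_spline(B @ omega + b, k)        # s(B w + e2 b, k)
    return (0.5 * omega @ omega
            + 0.5 * c * np.sum(hinge**2)
            + 0.5 * c_star * np.sum(smooth**2))

# Tiny synthetic example: labeled data (A, y) and unlabeled data B are made up for illustration.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 2)); y = np.sign(A[:, 0] + 0.1)
B = rng.normal(size=(40, 2))
res = minimize(ss3vm_objective, np.zeros(3), args=(A, y, B, 10.0, 10.0, 50.0), method="BFGS")
print(res.x)  # learned (omega_1, omega_2, b)
```

In the experiments of Section 6 the same kind of objective is minimized with the BFGS algorithm for each candidate parameter setting proposed by the GA.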

4 The Optimal SS3VM Model Based on GA

One of the main difficulties with the SS3VM is selecting parameter values that yield good performance, since it is not known beforehand which values are best for the model. In order to select suitable parameters for the SS3VM, the GA is applied[21, 26–28]. The GA is an artificial intelligence procedure based on the theory of natural selection and evolution. Unlike conventional optimization methods, it offers parallel search, the ability to solve complex problems, and a large search space.

The SS3VM training algorithm and the GA are combined to optimize the parameters of the SS3VM. For ease of notation, we call this hybrid method the GA/SS3VM method. Figure 2 shows its overall procedure.

Figure 2: Overall procedure of the GA/SS3VM method

The detailed steps of the GA/SS3VM method are given below; a code sketch of the loop follows the list.

  1. Define the string (or chromosome).

    According to Model (2), the parameters c, c*, and r need to be optimized, so an individual is defined as x = (c, c*, r). The chromosome of x is encoded as an l-bit string consisting of l1 bits for c, l2 bits for c*, and l3 bits for r, where l = l1 + l2 + l3.

  2. Determine the fitness function.

    The fitness of an individual in the population is based on the performance of the SS3VM, so the prediction accuracy of the SS3VM is taken as the fitness function F(x).

  3. Initialization.

    1. Define the size of population N, probability of crossover Pc, and probability of mutation Pm.

    2. Randomly generate an initial population of N l-bit strings as described in Step 1: $P(L)=\{x_j(L)=(c_j,c_j^*,r_j)_L,\ j=1,2,\cdots,N\}$, where L is the index of the current generation. For the initial generation, L = 0.

  4. Decode the jth string to obtain the corresponding individual $x_j(L)=(c_j,c_j^*,r_j)_L$.

  5. Apply $x_j(L)$ to the SS3VM model to compute the fitness $F(x_j(L))$.

  6. Evolution.

    1. Find the worst fitness $F_{\min}(L)$ and the best fitness $F_{\max}(L)$, together with the corresponding individuals $x_{\min}(L)$ and $x_{\max}(L)$, in the Lth generation population.

    2. Replace $x_{\min}(L)$ with $x_{\max}(L)$.

  7. Calculate the total fitness of the Lth generation population: $T(L)=\sum_{j=1}^{N}F(x_j(L))$.

  8. Reproduction.

    1. Compute the cumulative probabilities $q_j=\sum_{i=1}^{j}p_i\ (j=1,2,\cdots,N)$, where $p_i=\frac{F(x_i(L))}{T(L)}$.

    2. Generate N random numbers $r_{1i},\ i=1,2,\cdots,N$, in [0, 1]. For each $r_{1i}$: if $r_{1i}\le q_1$, select the first string; otherwise, select the jth string such that $q_{j-1}<r_{1i}\le q_j$.

  9. Generate offspring population P(L + 1) by performing crossover and mutation.

    1. Crossover: For each pair of parent individuals, generate a random number $r_2$ in [0, 1]. If $r_2<P_c$, choose a random crossover point and exchange the genetic code of the two parents at that point to obtain two new child individuals.

    2. Mutation: Generate a random number $r_3$ in [0, 1] and select a bit at random. If $r_3<P_m$, flip that bit.

  10. If the termination condition is satisfied, output the best individual $x_{\max}(L+1)$; otherwise, repeat Steps 4~10. Termination condition: the maximum number of generations $L_{\max}$ is reached, or the best individual does not improve over successive generations.
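The loop above can be sketched compactly as follows. This is our own minimal illustration, not the authors' code: the binary encoding ranges, the surrogate fitness, and all identifiers are assumptions, and in the actual GA/SS3VM method the fitness is the prediction accuracy obtained by training Model (11) with the decoded parameters.

```python
import numpy as np

rng = np.random.default_rng(42)
LEN_C, LEN_K = 5, 12                    # chromosome lengths for c (= c*) and k, as in Table 2
N, L_MAX, PC, PM = 40, 70, 0.7, 0.01    # population size, generations, crossover and mutation rates

def decode(bits):
    """Map a binary chromosome to (c, k); the value ranges are illustrative assumptions."""
    c = 1 + int("".join(map(str, bits[:LEN_C])), 2)
    k = 2 + int("".join(map(str, bits[LEN_C:])), 2)
    return c, k

def fitness(bits):
    """Placeholder surrogate; in GA/SS3VM this would be the SS3VM prediction accuracy."""
    c, k = decode(bits)
    return 1.0 / (1.0 + abs(c - 14) + abs(k - 1294) / 100.0)

pop = rng.integers(0, 2, size=(N, LEN_C + LEN_K))
for gen in range(L_MAX):
    fit = np.array([fitness(ind) for ind in pop])
    pop[np.argmin(fit)] = pop[np.argmax(fit)]                # evolution step: worst <- best
    fit = np.array([fitness(ind) for ind in pop])
    parents = pop[rng.choice(N, size=N, p=fit / fit.sum())]  # roulette-wheel reproduction
    children = parents.copy()
    for i in range(0, N - 1, 2):                             # one-point crossover
        if rng.random() < PC:
            cut = rng.integers(1, LEN_C + LEN_K)
            children[i, cut:], children[i + 1, cut:] = parents[i + 1, cut:].copy(), parents[i, cut:].copy()
    flips = rng.random(children.shape) < PM                  # bit-flip mutation
    pop = np.where(flips, 1 - children, children)

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("best (c, k):", decode(best))
```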

5 Data Preparation

5.1 Datasets

In this section, four datasets are used to test the hybrid GA/SS3VM method and our new SS3VM model. All datasets are obtained from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/) and are referred to as Heart, QSAR, Wine, and Wilt. The Heart dataset contains features of heart disease patients, and the heart disease condition is divided into two categories: presence and absence. The QSAR dataset is used to develop QSAR (Quantitative Structure-Activity Relationships) models and to study the relationship between chemical structure and biodegradation of molecules; the purpose is to discriminate readily biodegradable molecules from those that are not. The Wine dataset concerns white wine quality and includes objective data and sensory data. The objective data consist of 11 physicochemical attributes such as fixed acidity and volatile acidity. The sensory data are the median wine quality scores, from 0 (very bad) to 10 (very excellent), graded by experts. In our experiments, if the quality score is greater than 5 the wine quality is considered excellent; otherwise it is considered poor. The Wilt dataset involves detecting diseased trees in Quickbird imagery; it contains few samples for the "diseased trees" class and many for the "other land cover" class. All samples in the above four datasets are labeled. Detailed information about the datasets is shown in Table 1.

Table 1

Information of the four datasets

Data sets | Classes | Samples | Attributes
Heart | 2 | 270 | 13 (age, sex, cp, etc.)
QSAR | 2 | 1056 | 42 (nHM, nCp, etc.)
Wine | 2 | 4898 | 12 (fixed acidity, citric acid, etc.)
Wilt | 2 | 4339 | 5 (GLCM_Pan, Mean_G, etc.)

5.2 Data Pre-Processing

5.2.1 Data Reduction

The above datasets, especially QSAR, contain a large number of attributes. If all attributes were used as inputs to the SS3VM, this would result in redundancy and low efficiency. Therefore, principal component analysis (PCA) is applied to avoid these problems. PCA[29–32] can be used to reduce the complexity of the input variables and leads to a better interpretation of the variables. Let $X=(X_1,X_2,\cdots,X_p)$ denote the original variables and let $R$ be their variance-covariance matrix. Then the principal components can be expressed as follows:

$$Y_i=\alpha_i^TX=\alpha_{i1}X_1+\alpha_{i2}X_2+\cdots+\alpha_{ip}X_p$$

where $\alpha_i=(\alpha_{i1},\cdots,\alpha_{ip})^T$ is the eigenvector of $R$ associated with its $i$th largest eigenvalue. For the above four datasets, the numbers of original attributes are 13, 42, 12, and 5, respectively; PCA reduces them to 10, 15, 8, and 4 attributes, respectively. As a result, useless redundant information and computational complexity are reduced.
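As an illustration of this preprocessing step, the reduction can be carried out with a standard PCA implementation. The sketch below uses scikit-learn and random stand-in data; neither the library nor the placeholder matrix is prescribed by the paper.

```python
import numpy as np
from sklearn.decomposition import PCA

# X: samples-by-attributes matrix of one dataset (e.g., 1056 x 42 for QSAR).
X = np.random.default_rng(0).normal(size=(1056, 42))   # stand-in for the real data

# Keep the leading principal components (15 for QSAR in the paper's setting).
pca = PCA(n_components=15)
X_reduced = pca.fit_transform(X)                        # rows hold the scores Y_1, ..., Y_15
print(X_reduced.shape, pca.explained_variance_ratio_.sum())
```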

5.2.2 Data Normalization

In order to avoid numerical difficulties caused by the different numeric ranges of the variables, each selected input variable is normalized by $x_{\mathrm{scaled}}=2(x-x_{\min})/(x_{\max}-x_{\min})-1$, where $x$ is the original variable and $x_{\max}$ and $x_{\min}$ are its maximum and minimum, respectively.
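This formula maps each attribute linearly onto [-1, 1]; a direct NumPy version (our illustration) is:

```python
import numpy as np

def scale_to_minus_one_one(X):
    """Columnwise x_scaled = 2*(x - x_min)/(x_max - x_min) - 1, mapping each attribute to [-1, 1]."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return 2.0 * (X - x_min) / (x_max - x_min) - 1.0

X = np.array([[1.0, 10.0], [2.0, 30.0], [3.0, 50.0]])
print(scale_to_minus_one_one(X))   # each column now spans exactly [-1, 1]
```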

6 Numerical Experiments

6.1 Experimental Design

In order to test the GA/SS3VM method and the new Model (11), three other smooth SS3VM models are also used in our experiments. For ease of notation, these SS3VM models are named as follows:

  1. The SS3VM smoothed by Gaussian function is named as GSS3VM;

  2. The SS3VM smoothed by a polynomial function is named as PSS3VM;

  3. The SS3VM smoothed by the cubic spline function is named as 3SS3VM;

  4. The SS3VM Model (11) is named as 4SS3VM.

The experiments are divided into three parts as follows:

  1. Compare performance of four SS3VM models optimized by GA/SS3VM method with that of the original ones.

  2. Compare performance of the optimal 4SS3VM model with that of other three optimal SS3VM models.

  3. Evaluate the classification accuracy sensitivity of the optimal 4SS3VM model to the different proportion of labeled samples.

Here, the SS3VM models are solved by the BFGS algorithm, and the performance of the models is evaluated by classification accuracy and CPU time.

Since the samples of the above four datasets are all labeled, unlabeled samples for the semi-supervised experiments are simulated by dropping the labels of some labeled samples. Each dataset is randomly separated into two portions: a training set, used to train the models and select the optimal parameters, and a testing set, used to evaluate the performance of the models. The ratios of the two portions are about 0.7 and 0.3. In the first two parts of the experiment, for both the training set and the testing set of each dataset, the proportion of labeled samples is set to 1/5, i.e., 20% of the samples are kept as labeled and the rest are treated as unlabeled (a code sketch of this label-masking step is given after Table 2). For each training set, the GA/SS3VM method is used to optimize the parameters of the SS3VM model. For the four models GSS3VM, PSS3VM, 3SS3VM, and 4SS3VM, the parameters to be optimized are (c, c*), (c, c*, n), (c, c*, k), and (c, c*, k), respectively, where k is the smooth parameter and n is the degree of the polynomial function. In addition, it is assumed that, during the optimization process, the labels found for the unlabeled samples are correct, so c = c* is set. The GA parameters used to optimize these models are shown in Table 2.

Table 2

GA parameters for optimizing models

Parameters | Value
Size of population (N) | 40
Maximum number of generations (Lmax) | 70
Length of chromosome of c (c* = c) | 5
Length of chromosome of k | 12
Length of chromosome of n | 3
Crossover rate (Pc) | 0.7
Mutation rate (Pm) | 0.01
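The simulation of unlabeled samples described above amounts to splitting each dataset and then hiding the labels of a random 80% of each portion. A minimal sketch (our own, with assumed variable names and random stand-in data) is:

```python
import numpy as np

def split_and_mask(X, y, train_ratio=0.7, labeled_ratio=0.2, seed=0):
    """Split into train/test, then keep labels for only `labeled_ratio` of each portion."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(train_ratio * len(X))
    parts = {}
    for name, part in (("train", idx[:n_train]), ("test", idx[n_train:])):
        keep = rng.random(len(part)) < labeled_ratio          # ~20% keep their labels
        parts[name] = (X[part[keep]], y[part[keep]],          # labeled block: A and its labels
                       X[part[~keep]])                        # unlabeled block: B (labels dropped)
    return parts

X = np.random.default_rng(1).normal(size=(270, 10))           # e.g., Heart after PCA
y = np.sign(np.random.default_rng(2).normal(size=270))
parts = split_and_mask(X, y)
print([a.shape for a in parts["train"]], [a.shape for a in parts["test"]])
```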

In part three of the experiment, the sensitivity of the optimal 4SS3VM model to the proportion of labeled samples is tested. For each dataset, we vary the proportion of labeled samples and observe how the classification accuracy of the optimal 4SS3VM model fluctuates. Here, we select 20%, 30%, 40%, and 50% as the proportions of labeled samples.

6.2 Experimental Results and Comparisons

In this section, the results corresponding to the three parts of the experiments are presented and analysed. Figure 3 shows the evolution of the optimal parameter selection on the Heart dataset. As shown in Figure 3, for all four SS3VM models the stopping criterion of the GA is satisfied at about the 10th or 15th generation (L = 10 or L = 15), i.e., the optimal parameters of the four SS3VM models are obtained at about the 10th or 15th generation. The parameter selection procedures for the QSAR, Wine, and Wilt datasets are similar to that of the Heart dataset. It should be emphasized that the selection procedures differ between training sets; moreover, the process is generally not identical even for the same training set because of the stochastic nature of the GA. Table 3 summarizes the optimal parameters of the different SS3VM models on the above four datasets, together with the training accuracy, testing accuracy, and training time. As can be observed, compared with the other models, the 4SS3VM model achieves preferable training and testing accuracy on the four datasets with less CPU time.

Figure 3: Evolution of the parameter selection. (a), (b), (c), and (d) show the parameter selection procedures of GSS3VM, PSS3VM, 3SS3VM, and 4SS3VM, respectively (NB: Heart dataset, labeled proportion 20%)

Table 3

GA/SS3VM performance on the four datasets

Data sets | Models | Optimal parameters | Training accuracy/% | Testing accuracy/% | CPU time/s
Heart | GSS3VM | c = c* = 28 | 77.33 | 74.44 | 0.1814
Heart | PSS3VM | c = c* = 7, n = 4 | 77.33 | 73.33 | 2.0603
Heart | 3SS3VM | c = c* = 1, k = 82 | 77.77 | 73.33 | 0.1193
Heart | 4SS3VM | c = c* = 7, k = 2113 | 78.88 | 74.44 | 0.1148
QSAR | GSS3VM | c = c* = 31 | 71.55 | 72.15 | 0.4419
QSAR | PSS3VM | c = c* = 31, n = 8 | 72.54 | 74.14 | 2.9183
QSAR | 3SS3VM | c = c* = 30, k = 1564 | 71.69 | 67.61 | 0.5777
QSAR | 4SS3VM | c = c* = 28, k = 2093 | 71.97 | 74.44 | 0.3584
Wine | GSS3VM | c = c* = 25 | 65.97 | 67.60 | 2.4921
Wine | PSS3VM | c = c* = 5, n = 3 | 66.83 | 67.60 | 5.3202
Wine | 3SS3VM | c = c* = 15, k = 1109 | 65.97 | 67.60 | 5.3202
Wine | 4SS3VM | c = c* = 14, k = 1294 | 65.97 | 67.60 | 2.1299
Wilt | GSS3VM | c = c* = 18 | 98.54 | 97.78 | 2.6598
Wilt | PSS3VM | c = c* = 18, n = 9 | 98.54 | 97.78 | 18.9791
Wilt | 3SS3VM | c = c* = 1, k = 100 | 98.54 | 97.78 | 3.6831
Wilt | 4SS3VM | c = c* = 28, k = 640 | 98.54 | 97.78 | 3.5875

To compare the classification accuracy of the optimal SS3VM models with that of non-optimized SS3VM models, 20 groups of parameters are selected randomly for each model, and the Heart dataset is used to train and test these models. Let J be the index of the parameter group. Figure 4 shows the comparison results. It can easily be seen that the optimal SS3VM models outperform the models with non-optimized parameters on the training set. Although the testing accuracy of the optimal models is not always the best, it is good enough. This means that the GA/SS3VM method contributes to improving the classification accuracy, which strongly supports our claims. The optimal SS3VM models show similar behaviour on the other datasets.

Figure 4: Accuracy comparison between the optimal models and models with arbitrary parameters. (a), (b), (c), and (d) show the comparison results of GSS3VM, PSS3VM, 3SS3VM, and 4SS3VM, respectively (NB: Heart dataset, labeled proportion 20%)

Table 4 displays the classification accuracy and CPU time of the optimal 4SS3VM under different labeled proportions. It can be seen that the training and testing accuracy fluctuate only slightly, by less than about 5%. This means that the classification accuracy is not remarkably improved when the number of labeled samples increases, and it implies that the optimal 4SS3VM model has high computational efficiency: a small number of labeled samples is enough to guarantee the classification accuracy. Thus, the cost of manual labeling can be cut down greatly.

Table 4

Effect of the percentage of labeled samples on the 4SS3VM

Data sets | Labeled proportion/% | Training accuracy/% | Testing accuracy/% | CPU time/s
Heart | 20 | 78.88 | 74.44 | 0.1148
Heart | 30 | 78.88 | 74.44 | 0.1862
Heart | 40 | 81.66 | 72.22 | 0.2157
Heart | 50 | 82.77 | 74.44 | 0.3143
QSAR | 20 | 71.69 | 72.44 | 0.3584
QSAR | 30 | 72.26 | 73.29 | 0.5167
QSAR | 40 | 74.39 | 71.59 | 0.7695
QSAR | 50 | 70.98 | 73.01 | 0.7692
Wine | 20 | 65.97 | 67.60 | 2.4299
Wine | 30 | 66.27 | 66.99 | 3.7138
Wine | 40 | 66.00 | 67.00 | 5.3567
Wine | 50 | 66.03 | 67.66 | 14.7719
Wilt | 20 | 98.54 | 97.78 | 3.5875
Wilt | 30 | 98.54 | 97.78 | 7.2430
Wilt | 40 | 98.54 | 97.78 | 7.5340
Wilt | 50 | 98.30 | 98.27 | 15.4175

7 Conclusions

In this paper, a biquadratic spline function for smoothing the S3VM is proposed. The analysis of approximation accuracy shows that the biquadratic spline function has preferable performance. Further, a new approach, the GA/SS3VM method, which integrates the SS3VM with the GA, is presented; the GA is used to optimize the parameters of the SS3VM models, and the hybrid GA/SS3VM method yields the optimal SS3VM model. The optimal SS3VM model is experimentally evaluated on four real datasets. The results show that the SS3VM models optimized by the GA/SS3VM approach achieve higher classification accuracy than the non-optimized SS3VM models. In particular, the optimal SS3VM model based on the biquadratic spline function has desirable classification accuracy and the best computational efficiency. Meanwhile, the classification accuracy of the new model is insensitive to the labeled proportion, which means that good classification accuracy can be achieved with a small number of labeled samples.

For future work, we intend to apply kernel functions to the SS3VM and to optimize the kernel function and the parameters simultaneously.


Supported by the Fundamental Research Funds for the Central Universities of China (FRF-BR-12-021)


References

[1] Vapnik V. The Nature of Statistical Learning Theory. Springer, 1995. DOI: 10.1007/978-1-4757-2440-0.

[2] Joachims T. Text categorization with support vector machines: Learning with many relevant features. Springer, 1998. DOI: 10.1007/BFb0026683.

[3] Melgani F, Bruzzone L. Classification of hyperspectral remote sensing images with support vector machines. IEEE Transactions on Geoscience and Remote Sensing, 2004, 42(8): 1778–1790. DOI: 10.1109/IGARSS.2002.1025088.

[4] Jonsson K, Kittler J, Li Y P, et al. Support vector machines for face authentication. Image and Vision Computing, 2002, 20(5): 369–375. DOI: 10.1016/S0262-8856(02)00009-4.

[5] Satzger B, Kramer O. Goal distance estimation for automated planning using neural networks and support vector machines. Natural Computing, 2013, 12(1): 87–100. DOI: 10.1007/s11047-012-9332-y.

[6] Mukkamala S, Janoski G, Sung A. Intrusion detection using neural networks and support vector machines. Proceedings of the 2002 International Joint Conference on Neural Networks, 2002: 1702–1707. DOI: 10.1109/IJCNN.2002.1007774.

[7] Danenas P, Garsva G. Credit risk evaluation using SVM-based classifier. Business Information Systems International Workshops, 2010: 7–12. DOI: 10.1007/978-3-642-15402-7_3.

[8] Shin K S, Lee T S, Kim H J. An application of support vector machines in bankruptcy prediction model. Expert Systems with Applications, 2005, 28(1): 127–135. DOI: 10.1016/j.eswa.2004.08.009.

[9] Guo Z Q, Wang H Q, Liu Q. Financial time series forecasting using LPP and SVM optimized by PSO. Soft Computing, 2013, 17(5): 805–818. DOI: 10.1007/s00500-012-0953-y.

[10] Fung G, Mangasarian O L. Semi-supervised support vector machines for unlabeled data classification. Optimization Methods and Software, 2001, 15(1): 29–44. DOI: 10.1080/10556780108805809.

[11] Chapelle O, Scholkopf B, Zien A. Semi-Supervised Learning. MIT Press, Cambridge, 2006. DOI: 10.7551/mitpress/9780262033589.001.0001.

[12] Chapelle O, Zien A. Semi-supervised classification by low density separation. 2004.

[13] Chapelle O, Sindhwani V, Keerthi S. Branch and bound for semi-supervised support vector machines. Conference on Neural Information Processing Systems, 2007: 217–240.

[14] Astorino A, Fuduli A. Nonsmooth optimization techniques for semisupervised classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(12): 2135–2142. DOI: 10.1109/TPAMI.2007.1102.

[15] Reddy I S, Shevade S, Murty M N. A fast quasi-Newton method for semi-supervised SVM. Pattern Recognition, 2011, 44(10): 2305–2313. DOI: 10.1016/j.patcog.2010.09.002.

[16] Yang L M, Wang L S. A class of smooth semi-supervised SVM by difference of convex functions programming and algorithm. Knowledge-Based Systems, 2013, 41: 1–7. DOI: 10.1016/j.knosys.2012.12.004.

[17] Lee Y J, Mangasarian O L. SSVM: A smooth support vector machine for classification. Computational Optimization and Applications, 2001, 20(1): 5–22. DOI: 10.1023/A:1011215321374.

[18] Vural V, Fung G, Dy J, et al. Fast semi-supervised SVM classifiers using a priori metric information. Optimization Methods and Software, 2008, 23(4): 521–532. DOI: 10.1080/10556780802102750.

[19] Liu Y Q, Liu S Y, Gu M T. Polynomial smooth semi-supervised support vector machine. Systems Engineering — Theory & Practice, 2009, 29(7): 113–118.

[20] Zhang X D, Ma J G. A general cubic spline smooth semi-support vector machine. Chinese Journal of Engineering, 2015, 37(3): 385–389.

[21] Min S H, Lee J, Han I. Hybrid genetic algorithms and support vector machines for bankruptcy prediction. Expert Systems with Applications, 2006, 31(3): 652–660. DOI: 10.1016/j.eswa.2005.09.070.

[22] Chapelle O, Sindhwani V, Keerthi S S. Optimization techniques for semi-supervised support vector machines. The Journal of Machine Learning Research, 2008, 9: 203–233.

[23] Yuan Y, Huang T. A polynomial smooth support vector machine for classification. Advanced Data Mining and Applications, Springer, 2005. DOI: 10.1007/11527503_19.

[24] Dennis J E, Moré J J. Quasi-Newton methods, motivation and theory. SIAM Review, 1977, 19(1): 46–89. DOI: 10.1137/1019005.

[25] Yuan Y X. A modified BFGS algorithm for unconstrained optimization. IMA Journal of Numerical Analysis, 1991, 11(3): 325–332. DOI: 10.1093/imanum/11.3.325.

[26] Huerta E B, Duval B, Hao J K. A hybrid GA/SVM approach for gene selection and classification of microarray data. Applications of Evolutionary Computing, Springer, 2006. DOI: 10.1007/11732242_4.

[27] Zhao X, Huang D, Cheung Y, et al. A novel hybrid GA/SVM system for protein sequences classification. Intelligent Data Engineering and Automated Learning — IDEAL, 2004: 11–16. DOI: 10.1007/978-3-540-28651-6_2.

[28] Adankon M M, Cheriet M. Genetic algorithm-based training for semi-supervised SVM. Neural Computing and Applications, 2010, 19(8): 1197–1206. DOI: 10.1007/s00521-010-0358-8.

[29] Abdi H, Williams L J. Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(4): 433–459. DOI: 10.1002/wics.101.

[30] Manly B F J. Multivariate Statistical Methods: A Primer. 2nd Edition. Chapman & Hall/CRC Press, London, 1986.

[31] Tabachnick B G, Fidell L S. Using Multivariate Statistics. 3rd Edition. 2001.

[32] Noori R, Kerachian R, Darban A, et al. Assessment of importance of water quality monitoring stations using principal component and factor analyses: A case study of the Karoon River. Journal of Water & Wastewater, 2007, 63(3): 60–69.

Received: 2015-5-26
Accepted: 2015-7-22
Published Online: 2015-10-25

© 2015 Walter de Gruyter GmbH, Berlin/Boston
