Optimal Parameter Selection in Support Vector Machines
K. Schittkowski,
Journal of Industrial and Management Optimization, Vol. 1, No. 4, 465-476
(2005)
Abstract:
The purpose of the paper is to apply a nonlinear programming algorithm to
compute kernel and related parameters of a support vector machine (SVM) by a
two-level approach. The available training data are split into two groups: one set
for formulating a quadratic SVM with L2-soft margin, the other for minimizing
the generalization error, into which the optimal SVM variables are inserted.
Subsequently, the SVM is solved again, now for the entire set of training data,
and the total generalization error is evaluated on a separate set of test data.
Derivatives of the functions defining the optimization problem are evaluated
analytically, exploiting the Cholesky decomposition already available from
solving the quadratic SVM. The approach
is implemented and tested on standard data sets with up to 4,800
patterns. The results show a significant reduction of the generalization error,
an increase of the margin, and a reduction of the number of support vectors in
all cases where the data sets are sufficiently large. In a second set of test
runs, kernel parameters are assigned to individual features; redundant
attributes are identified and suitable relative weighting factors are computed.
Preprint available for download: svmoptim.pdf
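
The following is a minimal sketch of the two-level idea described in the
abstract, not the paper's implementation. It assumes, purely for illustration,
an RBF kernel with one width per feature, a least-squares SVM variant as the
inner solver (so that training reduces to a single Cholesky-factorized linear
solve), a smooth squared-error surrogate for the generalization error, and
SciPy's derivative-free Nelder-Mead method in place of the paper's analytical
derivatives. All names and parameter choices below are illustrative.

    # Sketch only: a least-squares SVM stand-in for the paper's
    # L2-soft-margin quadratic SVM, with a Cholesky-based inner solve.
    import numpy as np
    from scipy.linalg import cho_factor, cho_solve
    from scipy.optimize import minimize

    def rbf_kernel(A, B, gammas):
        """RBF kernel with a separate width gamma_j per feature."""
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2 * gammas).sum(axis=-1)
        return np.exp(-d2)

    def train(K, y, C):
        """Inner level: solve (K + I/C) alpha = y via Cholesky.
        K + I/C is symmetric positive definite, so cho_factor applies."""
        factor = cho_factor(K + np.eye(len(y)) / C)
        return cho_solve(factor, y)

    def validation_loss(theta, X_tr, y_tr, X_val, y_val):
        """Outer level: smooth surrogate of the generalization error,
        evaluated on the held-out half of the training data."""
        C, gammas = np.exp(theta[0]), np.exp(theta[1:])
        alpha = train(rbf_kernel(X_tr, X_tr, gammas), y_tr, C)
        f_val = rbf_kernel(X_val, X_tr, gammas) @ alpha
        return np.mean((f_val - y_val) ** 2)

    # Toy data: labels follow the sign of the first feature; the second
    # feature is uninformative noise, so its fitted width can be expected
    # to shrink, mimicking the identification of redundant attributes.
    rng = np.random.default_rng(0)
    n, d = 200, 2
    X = rng.normal(size=(n, d))
    y = np.where(X[:, 0] > 0, 1.0, -1.0)
    X[:, 0] += 0.3 * rng.normal(size=n)
    X_tr, y_tr, X_val, y_val = X[:120], y[:120], X[120:], y[120:]

    # Optimize log C and per-feature log widths to keep them positive.
    theta0 = np.zeros(1 + d)
    res = minimize(validation_loss, theta0,
                   args=(X_tr, y_tr, X_val, y_val), method="Nelder-Mead")
    C_opt, gammas_opt = np.exp(res.x[0]), np.exp(res.x[1:])
    print("C =", C_opt, "per-feature widths =", gammas_opt)

    # Final model with the fitted parameters, scored by 0/1 error on the
    # held-out set (standing in for the paper's separate test data).
    alpha = train(rbf_kernel(X_tr, X_tr, gammas_opt), y_tr, C_opt)
    pred = np.sign(rbf_kernel(X_val, X_tr, gammas_opt) @ alpha)
    print("validation error:", np.mean(pred != y_val))

The log-parameterization is one simple way to enforce positivity of C and the
kernel widths without explicit bound constraints; the paper itself applies a
nonlinear programming algorithm with analytical derivatives, which the
derivative-free optimizer here merely approximates.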