Fast Support Vector Machines for Continuous Data

K. Kramer, T. Luo, D. Goldgof, L. Hall, A. Remsen

IEEE Transactions on Systems, Man, and Cybernetics-Part B:

Volume 39, Issue 4, pp. 989-1001, March 2009

Abstract

Support vector machines can be trained to be very accurate classifiers and have been used in many applications. However, the training and to a lesser extent prediction time of support vector machines on very large data sets can be very long. This paper presents a fast compression method to scale up support vector machines to large data sets. A simple bit reduction method is applied to reduce the cardinality of the data by weighting representative examples. We then develop support vector machines trained on the weighted data. Experiments indicate that the bit reduction support vector machine produces a significant reduction in the time required for both training and prediction with minimum loss in accuracy. It is also shown to, typically, be more accurate than random sampling when the data are not over-compressed.
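The bit reduction step described in the abstract can be illustrated with a minimal sketch. This page does not spell out the exact procedure, so the details here are assumptions: each feature is min-max scaled to a `bits`-bit integer, the `shift` low-order bits are dropped with a right shift, and examples of the same class that fall into the same bin are replaced by their mean, weighted by the number of examples in the bin. The function name `bit_reduce` and its parameters are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def bit_reduce(X, y, bits=8, shift=0):
    """Compress (X, y) into weighted representative examples.

    Assumed scheme (not confirmed by the source): min-max scale each
    feature to an integer in [0, 2**bits - 1], drop `shift` low-order
    bits, and merge same-class examples that share a bin.
    """
    n_features = len(X[0])
    lo = [min(row[j] for row in X) for j in range(n_features)]
    hi = [max(row[j] for row in X) for j in range(n_features)]
    top = (1 << bits) - 1  # largest integer representable in `bits` bits

    bins = defaultdict(list)
    for row, label in zip(X, y):
        key = []
        for j, value in enumerate(row):
            span = hi[j] - lo[j]
            # Constant features carry no information; map them to bin 0.
            q = int((value - lo[j]) / span * top) if span > 0 else 0
            key.append(q >> shift)  # coarser bins as `shift` grows
        # Keying on the label keeps examples of different classes apart.
        bins[(label, tuple(key))].append(row)

    reps, labels, weights = [], [], []
    for (label, _), rows in bins.items():
        count = len(rows)
        # Representative = per-feature mean of the examples in the bin.
        reps.append([sum(r[j] for r in rows) / count for j in range(n_features)])
        labels.append(label)
        weights.append(count)
    return reps, labels, weights


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.01, 0.0], [1.0, 1.0], [0.99, 1.0]]
    y = [0, 0, 1, 1]
    reps, labels, weights = bit_reduce(X, y, bits=8, shift=4)
    print(len(reps) / len(X), weights)
```

The weights returned here would then be handed to an SVM trainer that accepts per-example weights. Under the same assumptions, `len(reps) / len(X)` plays the role of a compression ratio: smaller values mean more aggressive compression.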

Data Sets

Experiments were run on several data sets: banana [ratsch01], phoneme [elena], shuttle [statlog], page [merz], pendigit [merz], letter [merz], SIPPER II plankton images [luoicpr], waveform [merz], and satimage [merz]. They come from several sources and range in size from 5000 to 58,000 examples and from 2 to 40 attributes. We also ran experiments on the Adult, Forest, and Web data sets to compare with previous work, as detailed in the comparison subsection of the paper.

Dataset Descriptions

Data Set  Num Examples  Num Classes  Num Features  Nominal Features  Kernel  C   Gamma
Adult     45222         2            14            8                 Linear  1   na
Banana    5300          2            2             0                 RBF     1   3.4004
Forest    495141        2            54            44                Linear  1   na
Letter    20000         26           16            0                 RBF     6   0.079578
Page      5473          5            10            0                 RBF     1   0.08955
PenDigit  10992         10           16            0                 RBF     5   0.08955
Phoneme   5404          2            5             0                 RBF     13  1.48005
Plankton  8440          5            17            0                 RBF     16  0.04096
SatImage  6435          6            36            0                 RBF     9   0.14202
Shuttle   58000         7            9             0                 RBF     16  0.274727
WaveForm  5000          3            40            0                 RBF     1   0.008138
Web       36818         2            293           293 (True/False)  Linear  1   na

[ratsch01] G. Ratsch, T. Onoda, and K. Muller, Soft margins for AdaBoost, Machine Learning, vol. 42, no. 3, pp. 287-320, 2001.

[elena] ftp://ftp.dice.ucl.ac.be/pub/neural-nets/elena/database

[statlog] D. Michie, D. J. Spiegelhalter, and C. C. Taylor, Machine Learning, Neural and Statistical Classification, ftp://ftp.ncc.up.pt/pub/statlog/, 1994.

[merz] C. J. Merz and P. M. Murphy, UCI repository of machine learning databases, http://www.ics.uci.edu/~mlearn/MLRepository.html, 1999.

[luoicpr] T. Luo, K. Kramer, D. Goldgof, L. O. Hall, S. Samson, A. Remsen, and T. Hopkins, Active learning to recognize multiple types of plankton, 17th conference of the International Association for Pattern Recognition, vol. 3, pp. 478-481, 2004.


Contacts

Dmitry Goldgof
Larry Hall
Andrew Remsen

Alumni
Tong Luo

Graduate Students
Kurt Kramer

Results




Summary of Results by Dataset

                             Compression Ratios  Bit Reduction Results                                 Random Sampling
Data Set      Bit Red Level  Pure      Un Bal    Accuracy  Accuracy  Accuracy  SVs     SVs     SVs     Accuracy  Accuracy  SVs       SVs
                             Bit Red   Bit Red   SVM       s=0       BRSVM     SVM     s=0     BRSVM   s=0       BRSVM     s=0       BRSVM
adult(105)    b=9,s=4        0.7875    0.6339    84.65%    84.56%    84.64%    10580   8131    6306    84.67%    84.62%    8330.4    6076.7
banana(2)     b=8,s=0        0.0766    0.0767    90.57%    90.28%    90.28%    1055    174     174     89.70%    89.71%    132.5     132.5
forest(54)    b=10,s=4       0.1805    0.0757    75.85%    76.10%    75.89%    57125   8608    3413    75.79%    75.79%    10319.2   4330.4
letter(16)    b=9,s=13       0.8868    0.6536    97.53%    97.50%    97.30%    7051    6778    5758    97.22%    96.61%    6491.2    5306.8
page(10)      b=9,s=3        0.1800    0.1122    95.44%    95.35%    95.26%    512     282     211     93.93%    93.31%    130.3     88.2
pendigit(16)  b=9,s=11       0.9272    0.6210    98.14%    98.17%    98.03%    955     955     898     98.14%    98.01%    924.2     784.2
phoneme(5)    b=8,s=1        0.6859    0.6093    90.93%    90.93%    90.39%    1356    1164    1090    89.26%    88.96%    995.2     905.5
plankton(17)  b=9,s=8        0.9608    0.7059    89.60%    89.30%    88.60%    2628    2547    1815    89.19%    88.18%    2547.8    1959.3
satimage(36)  b=9,s=28       0.9896    0.8329    91.65%    91.65%    90.95%    1603    1603    1458    91.65%    91.41%    1588.5    1407.3
shuttle(9)    b=8,s=5        0.0310    0.0189    99.86%    99.84%    99.83%    404     304     302     99.41%    99.11%    139.7     122.1
waveform(40)  b=11,s=22      0.4495    0.1845    86.70%    85.80%    85.60%    1887    1001    431     86.47%    85.17%    1006.0    505.9
web(293)      b=0,s=0        0.3517    0.3517    74.58%    74.56%    74.56%    0       0       7361    74.89%    74.88%    0         0
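The random sampling columns serve as a baseline against bit reduction. This page does not state the sampling protocol, so the sketch below rests on an assumption: the baseline keeps a uniformly drawn subset of the training examples, without replacement, sized to match a given compression ratio. The function name `random_sample` and the `seed` parameter are hypothetical.

```python
import random


def random_sample(X, y, ratio, seed=0):
    """Keep roughly ratio * len(X) examples, drawn uniformly without replacement.

    Assumed baseline (not confirmed by the source): the subset size is
    matched to the compression ratio reached by bit reduction.
    """
    rng = random.Random(seed)  # local generator so runs are repeatable
    k = max(1, round(ratio * len(X)))  # always keep at least one example
    idx = sorted(rng.sample(range(len(X)), k))
    return [X[i] for i in idx], [y[i] for i in idx]


if __name__ == "__main__":
    X = [[float(i)] for i in range(100)]
    y = [i % 2 for i in range(100)]
    Xs, ys = random_sample(X, y, ratio=0.0766)
    print(len(Xs))
```

Repeating the draw with different seeds and averaging the outcomes would give fractional support vector counts like those in the random sampling columns, although the page does not say how many repetitions were used.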




Support Vectors (SVs) for each dataset by compression technique

                             Compression Ratios  Bit Reduction Results   Random Sampling
Data Set      Bit Red Level  s=0       BRSVM     SVM     s=0     BRSVM   s=0       BRSVM
adult(105)    b=9,s=4        0.7875    0.6339    10580   8131    6306    8330.4    6076.7
banana(2)     b=8,s=0        0.0767    0.0767    1055    174     174     132.5     132.5
forest(54)    b=10,s=4       0.1805    0.0757    57125   8608    3413    10319.2   4330.4
letter(16)    b=9,s=13       0.8868    0.6536    7051    6778    5758    6491.2    5306.8
page(10)      b=9,s=3        0.1800    0.1122    512     282     211     130.3     88.2
pendigit(16)  b=9,s=11       0.9273    0.6210    955     955     898     924.2     784.2
phoneme(5)    b=8,s=1        0.6859    0.6093    1356    1164    1090    995.2     905.5
plankton(17)  b=9,s=8        0.9608    0.7059    2628    2547    1815    2547.8    1959.3
satimage(36)  b=9,s=28       0.9896    0.8329    1603    1603    1458    1588.5    1407.3
shuttle(9)    b=8,s=5        0.0310    0.0189    404     304     302     139.7     122.1
waveform(40)  b=11,s=22      0.4495    0.1845    1887    1001    431     1006.0    505.9
web(293)      b=0,s=0        0.3517    0.3517    0       0       7361    0.0       0.0