LASVM Applied to Invariant Problems

Authors: Gaëlle Loosli, Léon Bottou∗∗, Stéphane Canu
gaelle.loosli@insa-rouen.fr
∗∗ NEC Laboratories America
Lab. Perception, Systèmes, Information - CNRS - FRE 2645
NIPS Workshop on Large Scale Kernel Machines
Gaëlle Loosli LASVM applied to invariant problems 1/20
Invariances Invariant LASVM Experiments Conclusion
Roadmap
1 Invariances
2 Invariant LASVM
3 Experiments
4 Conclusion
1 Invariances
    Definitions and tools
    Learning with invariances
2 Invariant LASVM
3 Experiments
4 Conclusion
Definitions for deformations
If $x$ represents a 3, then $T(x, d)$ should also represent a 3:
Tangent vectors
Deformation fields
Regular
Noisy (smoothed)
What to use for invariances?
Affine deformations (linear approximations)
Thickening
Elastic deformations
$x_{affine}(i,j) = x(i,j) + \alpha_x d_x t_x(i,j) + \alpha_y d_y t_y(i,j)$
Rotations with tangent vectors
Translations with tangent vectors
Thickening:
$x_{thick}(i,j) = x(i,j) + \beta \sqrt{t_x(i,j)^2 + t_y(i,j)^2}$
Elastic deformations:
$x_{deformed}(i,j) = x(i,j) + f_x(i,j)\, t_x(i,j) + f_y(i,j)\, t_y(i,j)$
[SLDV00] P. Y. Simard, Y. LeCun, J. S. Denker, and B. Victorri. Transformation invariance in pattern recognition - tangent distance and tangent propagation. International Journal of Imaging Systems and Technology, 11(3), 2000.
[SSP03] Patrice Y. Simard, Dave Steinkraus, and John C. Platt. Best practices for convolutional neural networks applied to visual document analysis. In ICDAR '03: Proceedings of the Seventh International Conference on Document Analysis and Recognition, page 958, Washington, DC, USA, 2003. IEEE Computer Society.
What to use for invariances?
Affine deformations (linear approximations) [SLDV00]
Thickening
Elastic deformations [SSP03]
Random deformation
$x_{transformed}(i,j) = x(i,j) + \alpha \left( f_x(i,j)\, t_x(i,j) + f_y(i,j)\, t_y(i,j) \right) + \beta \sqrt{t_x(i,j)^2 + t_y(i,j)^2}$
What do we need?
Tangent vectors - linear deformations $(t_x, t_y)$
Deformation fields (regular or noisy) $(f_x, f_y)$
Deformation strength parameters $(\alpha, \beta)$
We also add translations of 1 and 2 pixels in any of the 8 directions
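The random deformation formula can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the tangent vectors are approximated by finite differences, the noisy fields are drawn uniformly in [-1, 1] and smoothed with a small box blur, and the names `tangent_vectors`, `smooth` and `random_deformation` are hypothetical.

```python
import numpy as np

def tangent_vectors(x):
    """Approximate (t_x, t_y) by finite differences (an assumption of this sketch)."""
    # np.gradient returns the derivative along rows (vertical) first, then columns.
    ty, tx = np.gradient(np.asarray(x, dtype=float))
    return tx, ty

def smooth(field, passes=2):
    """Smooth a noisy field with repeated 3x3 box blurs (assumed smoothing choice)."""
    h, w = field.shape
    for _ in range(passes):
        padded = np.pad(field, 1, mode="edge")
        field = sum(padded[i:i + h, j:j + w]
                    for i in range(3) for j in range(3)) / 9.0
    return field

def random_deformation(x, alpha, beta, rng):
    """x + alpha * (f_x t_x + f_y t_y) + beta * sqrt(t_x^2 + t_y^2)."""
    tx, ty = tangent_vectors(x)
    fx = smooth(rng.uniform(-1.0, 1.0, size=tx.shape))  # noisy (smoothed) field
    fy = smooth(rng.uniform(-1.0, 1.0, size=ty.shape))
    return x + alpha * (fx * tx + fy * ty) + beta * np.sqrt(tx**2 + ty**2)
```

With `alpha = 0` and `beta = 0` the function returns the original image, which is a convenient sanity check before tuning the deformation strengths.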
How to incorporate invariances in learning?
Invariances with SVM methods
Modify the cost function
Learn trajectories
Add some deformed points
Modify the cost function - Tangent kernels [HK02, CS02]
Changes the distance between elements using tangent distance
[CS02] O. Chapelle and B. Schölkopf. Incorporating invariances in nonlinear SVMs. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems, volume 14, pages 609-616, Cambridge, MA, USA, 2002. MIT Press.
[HK02] B. Haasdonk and D. Keysers. Tangent distance kernels for support vector machines. In International Conference on Pattern Recognition, Quebec City, Canada, August 2002.
Learn trajectories - SDPM [GH04]
The idea is to classify the trajectories defined by a transformation with a continuous parameter.
[GH04] Thore Graepel and Ralf Herbrich. Invariant pattern recognition by semi-definite programming machines. In Sebastian Thrun, Lawrence Saul, and Bernhard Schölkopf, editors, Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA, 2004.
Database modification - Virtual SVM [DS02]
Adds deformed examples to the database.
[DS02] D. DeCoste and B. Schölkopf. Training invariant support vector machines. Machine Learning, 46:161-190, 2002.
Add some selected deformed points - combination of trajectories and virtual points [LCVJ05]
Uses discretized trajectories. The idea is to add to the database only the virtual points that are support vectors (SV).
[LCVJ05] Gaëlle Loosli, Stéphane Canu, S. V. N. Vishwanathan, and Alexander J. Smola. Invariances in classification: an efficient SVM implementation. In ASMDA 2005 - Applied Stochastic Models and Data Analysis, 2005.
How to incorporate invariances in learning?
Good and bad
Method | Pros | Cons
Tangent distance | Efficient | Linear deformations only
Trajectories learning | All deformations | SDP - hard to solve
Virtual vectors | Simple, all deformations | Requires a lot of memory
Selected virtual vectors | Simple, all deformations | Not yet fast enough (SimpleSVM limitations)
Objective
Apply the selected virtual vectors idea to a better-suited algorithm,
namely LASVM
1 Invariances
2 Invariant LASVM
    Definitions
    Algorithm
3 Experiments
4 Conclusion
Helpful definitions and generalities
Groups of points: $I_A$ (active set) and $I_0$ (inactive set)
Selection: pick a point from $I_0$ to transfer it to $I_A$
Optimization: over $I_A$, may transfer a point from $I_A$ to $I_0$
Finalization: checks the final solution $I_A$
Iterative SVM
Initialize
While selection is possible
    Selection
    Optimization
Finalize
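The generic loop above can be written as a small skeleton. This is a sketch under assumptions: the three steps are passed in as hypothetical callbacks (`select`, `optimize`, `finalize`), sets of indices stand in for $I_A$ and $I_0$, and no actual SVM solver is included.

```python
def iterative_svm(n_points, select, optimize, finalize):
    """Generic active-set loop: Initialize, Selection/Optimization, Finalize.

    select(active, inactive) -> index from `inactive`, or None when no
                                selection is possible anymore
    optimize(active)         -> list of indices to move back to the inactive set
    finalize(active)         -> final solution built from the active set
    """
    active = set()                    # I_A, initially empty
    inactive = set(range(n_points))   # I_0, initially every point
    while True:
        candidate = select(active, inactive)   # Selection
        if candidate is None:
            break
        inactive.remove(candidate)
        active.add(candidate)
        for idx in optimize(active):           # Optimization
            active.remove(idx)                 # point leaves I_A ...
            inactive.add(idx)                  # ... and returns to I_0
    return finalize(active)                    # Finalization
```

Termination depends on the `select` callback eventually returning `None`; in this skeleton nothing prevents a removed point from being proposed again, so that policy is left to the caller.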
Algorithms - iterative methods
General loop
General | SimpleSVM [VSM03] | LASVM [BEWB05]
Selection | Takes the most violating point in $I_0$ | Takes a point among the next few points (according to a criterion)
Optimization | Makes a full optimization over $I_A$ | One SMO step between the candidate and a point from $I_A$ (Process) and one SMO step between two points of $I_A$ (Reprocess)
Finalization | Stops once no point is a violator anymore | Once all points have been seen once, makes a full optimization over $I_A$ (end of an epoch)
[BEWB05] Antoine Bordes, Seyda Ertekin, Jason Weston, and Léon Bottou. Fast kernel classifiers with online and active learning. Journal of Machine Learning Research, 6:1579-1619, September 2005.
[VSM03] S. V. N. Vishwanathan, A. J. Smola, and M. Narasimha Murty. SimpleSVM. In Proceedings of the Twentieth International Conference on Machine Learning, 2003.
LASVM specificities
Things to play with
Selection criteria
Number of Reprocess (level of optimization at each step)
Number of epochs
Complete - brute force
Gradient
Active - distance to the margins
Auto-active
Complete
Every point is selected as a candidate to the active set
Gradient
Looks at the next $z$ points and selects the most misclassified one
Active
Selects a point if it is between the margins (±δ/2)
Auto-active
Looks at the next points, keeps the points selected by the active rule,
until 5 points are kept or 100 points checked. Selects the closest
point to the decision boundary
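The gradient, active and auto-active rules can be sketched as follows. This is an illustration, not LASVM's actual code: it assumes the current decision values $f(x)$ of the upcoming candidates are already available in an array `scores`, reads "most misclassified" as the smallest $y f(x)$, and reads "between the margins (±δ/2)" as $|f(x)| \le \delta/2$. All function names are hypothetical.

```python
import numpy as np

def select_gradient(scores, labels, z):
    """Gradient rule: among the next z candidates, pick the most misclassified,
    taken here as the smallest y * f(x) (an assumption of this sketch)."""
    margins = np.asarray(labels[:z]) * np.asarray(scores[:z])
    return int(np.argmin(margins))

def is_active(score, delta):
    """Active rule: keep a candidate lying within the band |f(x)| <= delta / 2."""
    return abs(score) <= delta / 2

def select_auto_active(scores, delta, max_kept=5, max_checked=100):
    """Auto-active rule: scan candidates until max_kept pass the active rule or
    max_checked have been examined, then return the one closest to the
    decision boundary (None if no candidate passed)."""
    kept = []
    for i, score in enumerate(scores[:max_checked]):
        if is_active(score, delta):
            kept.append(i)
            if len(kept) == max_kept:
                break
    if not kept:
        return None
    return min(kept, key=lambda i: abs(scores[i]))
```

The complete (brute force) rule needs no function: every candidate is simply handed to the active set in turn.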
1 Invariances
2 Invariant LASVM
3 Experiments
    On memory usage
    Which settings?
    Results
4 Conclusion
What is stored?
[Figure: what is stored in memory versus what is available. Stored: the original database plus its horizontal and vertical tangent vectors, i.e. 3 times the size of the original dataset. Available: an infinite-sized database (the original dataset, the dataset transformed with parameters α1 and β1, with α2 and β2, with α3 and β3, etc.), i.e. as many examples as wanted, with the correct deformation parameters.]
Memory usage
We only need to store 3 times the size of the original database. For
convenience, we also store some pre-generated fields.
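A minimal sketch of this storage scheme, assuming finite-difference tangent vectors and caller-supplied deformation fields: only the originals and their two tangent images are kept, and each deformed example is computed when requested. The class name `VirtualDataset` and its methods are hypothetical.

```python
import numpy as np

class VirtualDataset:
    """Keeps originals and tangent vectors; deformed examples are built on demand."""

    def __init__(self, images):
        self.x = np.asarray(images, dtype=float)   # shape (n, height, width)
        # Finite differences along rows (vertical) and columns (horizontal).
        self.ty, self.tx = np.gradient(self.x, axis=(1, 2))

    def stored_values(self):
        """Number of stored values: originals plus the two tangent images."""
        return self.x.size + self.tx.size + self.ty.size

    def deformed(self, i, fx, fy, alpha, beta=0.0):
        """Deformed copy of example i for fields (fx, fy) and strengths (alpha, beta)."""
        tx, ty = self.tx[i], self.ty[i]
        return (self.x[i] + alpha * (fx * tx + fy * ty)
                + beta * np.sqrt(tx**2 + ty**2))
```

For `n` images, `stored_values()` is exactly three times `images.size`, matching the factor of 3 quoted above; a pool of pre-generated fields would add to that.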
What are the relevant deformations for our task?
[Figure: Effect of transformations, trained on 5000 points and 10 transformations each. x-axis: strength of applied transformations, α (0 to 4); y-axis: error rate (1.4 to 2.8). Curves: fields only; fields + 1px translation; fields + 2px translation; fields + thickening; fields + thickening + 1px translation; fields + thickening + 2px translation; best configuration.]
On the deformations
Translations are the most useful deformations for MNIST. Thickening
does not help on the MNIST database, hence β = 0.
Which mode will we use?
[Figure: three bar charts comparing the modes for LASVM (None, BF, A, AA). Panels: Error rate (error percentage); Size of the solution (number of support vectors, ×10^5); Training time (s, ×10^5).]
None: no invariances
BF: Brute Force
A: Active
AA: Auto-active
Does it help to use those deformations?
On the efficiency
Adding some deformed examples slowly (but surely) improves
the solution.
Optimal results
Which transformations: all but thickening
Training data size: 8 million examples
Solution size: about 120,000 SVs for the 10 classifiers
Computational time: 8 days
Performance: 0.67 % error
NB: the machine used is a dual Opteron with 16 GB of RAM
(6.5 GB of cache used for the kernel)...
1 Invariances
2 Invariant LASVM
3 Experiments
4 Conclusion
Conclusion
We wanted to solve an invariance problem
We needed to solve large SVMs to do so
As a result...
We ran some very large SVMs (the largest so far?)
We obtained fairly good accuracy (even though
convolutional networks are still much better)
Perspective
Challenge
Yann’s challenge on stereo pictures?